NOVEL NUCLEIC ACIDS AND POLYPEPTIDES

1. CROSS REFERENCE TO RELATED APPLICATIONS 

This application claims the priority benefit of U.S. Provisional Application Serial No.
60/323,739 filed September 19, 2001 entitled "Novel Nucleic Acids and Polypeptides",
Attorney Docket No. 809, which is a continuation-in-part application of PCT Application
Serial No. PCT/US00/35017 filed December 22, 2000 entitled "Novel Contigs Obtained
from Various Libraries", Attorney Docket No. 784CIP3 A/PCT, which in turn is a
continuation-in-part application of U.S. Application Serial No. 09/552,317 filed April 25,
2000 entitled "Novel Contigs Obtained from Various Libraries", Attorney Docket No.
784CIP, which in turn is a continuation-in-part application of U.S. Application Serial No.
09/488,725 filed January 21, 2000 entitled "Novel Contigs Obtained from Various
Libraries", Attorney Docket No. 784; PCT Application Serial No. PCT/US01/02623 filed
January 25, 2001 entitled "Novel Contigs Obtained from Various Libraries", Attorney
Docket No. 785CIP3/PCT, which in turn is a continuation-in-part application of U.S.
Application Serial No. 09/491,404 filed January 25, 2000 entitled "Novel Contigs Obtained
from Various Libraries", Attorney Docket No. 785; PCT Application Serial No.
PCT/US01/03800 filed February 5, 2001 entitled "Novel Contigs Obtained from Various
Libraries", Attorney Docket No. 787CIP3/PCT, which in turn is a continuation-in-part
application of U.S. Application Serial No. 09/560,875 filed April 27, 2000 entitled "Novel
Contigs Obtained from Various Libraries", Attorney Docket No. 787CIP, which in turn is a
continuation-in-part application of U.S. Application Serial No. 09/496,914 filed February 03,
2000 entitled "Novel Contigs Obtained from Various Libraries", Attorney Docket No. 787;
PCT Application Serial No. PCT/US01/04927 filed February 26, 2001 entitled "Novel
Contigs Obtained from Various Libraries", Attorney Docket No. 788CIP3/PCT, which in
turn is a continuation-in-part application of U.S. Application Serial No. 09/577,409 filed
May 18, 2000 entitled "Novel Contigs Obtained from Various Libraries", Attorney Docket
No. 788CIP, which in turn is a continuation-in-part application of U.S. Application Serial
No. 09/515,126 filed February 28, 2000 entitled "Novel Contigs Obtained from Various
Libraries", Attorney Docket No. 788; PCT Application Serial No. PCT/US01/04941 filed
March 5, 2001 entitled "Novel Contigs Obtained from Various Libraries", Attorney Docket
No. 789CIP3/PCT, which in turn is a continuation-in-part application of U.S. Application
Serial No. 09/574,454 filed May 19, 2000 entitled "Novel Contigs Obtained from Various
Libraries", Attorney Docket No. 789CIP, which in turn is a continuation-in-part application
of U.S. Application Serial No. 09/519,705 filed March 07, 2000 entitled "Novel Contigs
Obtained from Various Libraries", Attorney Docket No. 789; PCT Application Serial No.
PCT/US01/08631 filed March 30, 2001 entitled "Novel Contigs Obtained from Various
Libraries", Attorney Docket No. 790CIP3/PCT, which in turn is a continuation-in-part
application of U.S. Application Serial No. 09/649,167 filed August 23, 2000 entitled "Novel
Contigs Obtained from Various Libraries", Attorney Docket No. 790CIP, which in turn is a
continuation-in-part application of U.S. Application Serial No. 09/540,217 filed March 31,
2000 entitled "Novel Contigs Obtained from Various Libraries", Attorney Docket No. 790;
PCT Application Serial No. PCT/US01/08656 filed April 18, 2001 entitled "Novel Contigs
Obtained from Various Libraries", Attorney Docket No. 791CIP3/PCT, which in turn is a
continuation-in-part application of U.S. Application Serial No. 09/770,160 filed January 26,
2001 entitled "Novel Contigs Obtained from Various Libraries", Attorney Docket No.
791CIP, which is in turn a continuation-in-part application of U.S. Application Serial No.
09/552,929 filed April 18, 2000 entitled "Novel Contigs Obtained from Various Libraries",
Attorney Docket No. 791; and PCT Application Serial No. PCT/US01/14827 filed May 16,
2001 entitled "Novel Contigs Obtained from Various Libraries", Attorney Docket No.
792CIP3/PCT, which in turn is a continuation-in-part application of U.S. Application Serial
No. 09/577,408 filed May 18, 2000 entitled "Novel Contigs Obtained from Various
Libraries", Attorney Docket No. 792; all of which are incorporated herein by reference in
their entirety.

2. BACKGROUND OF THE INVENTION 

2.1 TECHNICAL FIELD

The present invention provides novel polynucleotides and proteins encoded by such 
polynucleotides, along with uses for these polynucleotides and proteins, for example in 
therapeutic, diagnostic and research methods. 

2.2 BACKGROUND

Technology aimed at the discovery of protein factors (including e.g., cytokines, such 
as lymphokines, interferons, circulating soluble factors, chemokines, and interleukins) has 
matured rapidly over the past decade. The now routine hybridization cloning and expression
cloning techniques clone novel polynucleotides "directly" in the sense that they rely on
information directly related to the discovered protein (i.e., partial DNA/amino acid sequence
of the protein in the case of hybridization cloning; activity of the protein in the case of
expression cloning). More recent "indirect" cloning techniques such as signal sequence
cloning, which isolates DNA sequences based on the presence of a now well-recognized
secretory leader sequence motif, as well as various PCR-based or low stringency
hybridization-based cloning techniques, have advanced the state of the art by making
available large numbers of DNA/amino acid sequences for proteins that are known to have
biological activity, for example, by virtue of their secreted nature in the case of leader
sequence cloning, by virtue of their cell or tissue source in the case of PCR-based
techniques, or by virtue of structural similarity to other genes of known biological activity.

Identified polynucleotide and polypeptide sequences have numerous applications in,
for example, diagnostics, forensics, gene mapping, identification of mutations responsible
for genetic disorders or other traits, assessment of biodiversity, and production of many other
types of data and products dependent on DNA and amino acid sequences.

3. SUMMARY OF THE INVENTION 

The compositions of the present invention include novel isolated polypeptides, novel 
isolated polynucleotides encoding such polypeptides, including recombinant DNA molecules, 
cloned genes or degenerate variants thereof, especially naturally occurring variants such as
allelic variants, antisense polynucleotide molecules, and antibodies that specifically recognize 
one or more epitopes present on such polypeptides, as well as hybridomas producing such 
antibodies. 

The compositions of the present invention additionally include vectors, including 
expression vectors, containing the polynucleotides of the invention, cells genetically engineered
to contain such polynucleotides and cells genetically engineered to express such 
polynucleotides. 

The present invention relates to a collection or library of at least one novel nucleic acid 
sequence assembled from expressed sequence tags (ESTs) isolated mainly by sequencing by 
hybridization (SBH), and in some cases, sequences obtained from one or more public
databases. The invention relates also to the proteins encoded by such polynucleotides, along
with therapeutic, diagnostic and research utilities for these polynucleotides and proteins. These
nucleic acid sequences are designated as SEQ ID NO: 1-276, or 553-772 and are provided in
the Sequence Listing. In the nucleic acids provided in the Sequence Listing, A is adenine; C is
cytosine; G is guanine; T is thymine; and N is any of the four bases or unknown. In the amino 
acids provided in the Sequence Listing, * corresponds to the stop codon. 

The nucleic acid sequences of the present invention also include nucleic acid sequences
that hybridize to the complement of SEQ ID NO: 1-276, or 553-772 under stringent
hybridization conditions; nucleic acid sequences which are allelic variants or species
homologues of any of the nucleic acid sequences recited above; or nucleic acid sequences that
encode a peptide comprising a specific domain or truncation of the peptides encoded by SEQ
ID NO: 1-276, or 553-772. Also included is a polynucleotide comprising a nucleotide sequence
having at least 90% identity to an identifying sequence of SEQ ID NO: 1-276, or 553-772 or a
degenerate variant or fragment thereof. The identifying sequence can be 100 base pairs in length.

The nucleic acid sequences of the present invention also include the sequence 
information from the nucleic acid sequences of SEQ ID NO: 1-276, or 553-772. The sequence 
information can be a segment of any one of SEQ ID NO: 1-276, or 553-772 that uniquely
identifies or represents the sequence information of SEQ ID NO: 1-276, or 553-772.

A collection as used in this application can be a collection of only one polynucleotide. 
The collection of sequence information or identifying information of each sequence can be 
provided on a nucleic acid array. In one embodiment, segments of sequence information are 
provided on a nucleic acid array to detect the polynucleotide that contains the segment. The
array can be designed to detect full-match or mismatch to the polynucleotide that contains the
segment. The collection can also be provided in a computer-readable format. 

This invention also includes the reverse or direct complement of any of the nucleic acid 
sequences recited above; cloning or expression vectors containing the nucleic acid sequences; 
and host cells or organisms transformed with these expression vectors. Nucleic acid sequences
(or their reverse or direct complements) according to the invention have numerous applications
in a variety of techniques known to those skilled in the art of molecular biology, such as use as
hybridization probes, use as primers for PCR, use in an array, use in computer-readable media,
use in sequencing full-length genes, use for chromosome and gene mapping, use in the
recombinant production of protein, and use in the generation of anti-sense DNA or RNA, their
chemical analogs and the like.

In a preferred embodiment, the nucleic acid sequences of SEQ ID NO: 1-276, or 553-772
or novel segments or parts of the nucleic acids of the invention are used as primers in
expression assays that are well known in the art. In a particularly preferred embodiment, the
nucleic acid sequences of SEQ ID NO: 1-276, or 553-772 or novel segments or parts of the
nucleic acids provided herein are used in diagnostics for identifying expressed genes or, as well 
known in the art and exemplified by Vollrath et al., Science 258:52-59 (1992), as expressed 
sequence tags for physical mapping of the human genome. 

The isolated polynucleotides of the invention include, but are not limited to, a
polynucleotide comprising any one of the nucleotide sequences set forth in SEQ ID NO: 1-276,
or 553-772; a polynucleotide comprising any of the full length protein coding sequences of
SEQ ID NO: 1-276, or 553-772; and a polynucleotide comprising any of the nucleotide
sequences of the mature protein coding sequences of SEQ ID NO: 1-276, or 553-772. The
polynucleotides of the present invention also include, but are not limited to, a polynucleotide
that hybridizes under stringent hybridization conditions to (a) the complement of any one of the
nucleotide sequences set forth in SEQ ID NO: 1-276, or 553-772; (b) a nucleotide sequence
encoding any one of the amino acid sequences set forth in SEQ ID NO: 1-276, or 553-772; (c) a
polynucleotide which is an allelic variant of any polynucleotides recited above; (d) a
polynucleotide which encodes a species homologue (e.g., orthologs) of any of the proteins
recited above; or (e) a polynucleotide that encodes a polypeptide comprising a specific domain
or truncation of any of the polypeptides comprising an amino acid sequence set forth in SEQ ID
NO: 277-552, or 773-992, or Tables 3, 4A, 4B, 5, 6, or 8.

The isolated polypeptides of the invention include, but are not limited to, a polypeptide
comprising any of the amino acid sequences set forth in the Sequence Listing; or the
corresponding full length or mature protein. Polypeptides of the invention also include
polypeptides with biological activity that are encoded by (a) any of the polynucleotides having
a nucleotide sequence set forth in SEQ ID NO: 1-276, or 553-772; or (b) polynucleotides that
hybridize to the complement of the polynucleotides of (a) under stringent hybridization
conditions. Biologically active variants of any of the polypeptide sequences in the Sequence
Listing, and "substantial equivalents" thereof (e.g., with at least about 65%, 70%, 75%, 80%,
85%, 90%, 95%, 98% or 99% amino acid sequence identity) that preferably retain biological
activity are also contemplated. The polypeptides of the invention may be wholly or partially
chemically synthesized but are preferably produced by recombinant means using the genetically
engineered cells (e.g., host cells) of the invention.

The invention also provides compositions comprising a polypeptide of the invention. 
Polypeptide compositions of the invention may further comprise an acceptable carrier, such 
as a hydrophilic, e.g., pharmaceutically acceptable, carrier.

The invention also provides host cells transformed or transfected with a
polynucleotide of the invention. 

The invention also relates to methods for producing a polypeptide of the invention 
comprising growing a culture of the host cells of the invention in a suitable culture medium 
under conditions permitting expression of the desired polypeptide, and purifying the
polypeptide from the culture or from the host cells. Preferred embodiments include those in
which the protein produced by such processes is a mature form of the protein. 

Polynucleotides according to the invention have numerous applications in a variety 
of techniques known to those skilled in the art of molecular biology. These techniques
include use as hybridization probes, use as oligomers, or primers, for PCR, use for
chromosome and gene mapping, use in the recombinant production of protein, and use in
generation of anti-sense DNA or RNA, their chemical analogs and the like. For example,
when the expression of an mRNA is largely restricted to a particular cell or tissue type,
polynucleotides of the invention can be used as hybridization probes to detect the presence
of the particular cell or tissue mRNA in a sample using, e.g., in situ hybridization.

In other exemplary embodiments, the polynucleotides are used in diagnostics as 
expressed sequence tags for identifying expressed genes or, as well known in the art and 
exemplified by Vollrath et al., Science 258:52-59 (1992), as expressed sequence tags for 
physical mapping of the human genome. 

The polypeptides according to the invention can be used in a variety of conventional
procedures and methods that are currently applied to other proteins. For example, a
polypeptide of the invention can be used to generate an antibody that specifically binds the
polypeptide. Such antibodies, particularly monoclonal antibodies, are useful for detecting or
quantitating the polypeptide in tissue. The polypeptides of the invention can also be used as
molecular weight markers, and as a food supplement.

Methods are also provided for preventing, treating, or ameliorating a medical 
condition which comprises the step of administering to a mammalian subject a 
therapeutically effective amount of a composition comprising a polypeptide of the present 
invention and a pharmaceutically acceptable carrier.

In particular, the polypeptides and polynucleotides of the invention can be utilized,
for example, in methods for the prevention and/or treatment of disorders involving aberrant
protein expression or biological activity. 



WO 03/025148 PCT/US02/29964 


The present invention further relates to methods for detecting the presence of the 
polynucleotides or polypeptides of the invention in a sample. Such methods can, for 
example, be utilized as part of prognostic and diagnostic evaluation of disorders as recited 
herein and for the identification of subjects exhibiting a predisposition to such conditions. 
The invention provides a method for detecting the polynucleotides of the invention in a
sample, comprising contacting the sample with a compound that binds to and forms a
complex with the polynucleotide of interest for a period sufficient to form the complex and
under conditions sufficient to form a complex and detecting the complex such that if a
complex is detected, the polynucleotide of interest is detected. The invention also provides a
method for detecting the polypeptides of the invention in a sample comprising contacting the
sample with a compound that binds to and forms a complex with the polypeptide under 
conditions and for a period sufficient to form the complex and detecting the formation of the 
complex such that if a complex is formed, the polypeptide is detected. 

The invention also provides kits comprising polynucleotide probes and/or
monoclonal antibodies, and optionally quantitative standards, for carrying out methods of the
invention. Furthermore, the invention provides methods for evaluating the efficacy of drugs, 
and monitoring the progress of patients, involved in clinical trials for the treatment of 
disorders as recited above. 

The invention also provides methods for the identification of compounds that
modulate (i.e., increase or decrease) the expression or activity of the polynucleotides and/or
polypeptides of the invention. Such methods can be utilized, for example, for the
identification of compounds that can ameliorate symptoms of disorders as recited herein.
Such methods can include, but are not limited to, assays for identifying compounds and
other substances that interact with (e.g., bind to) the polypeptides of the invention. The
invention provides a method for identifying a compound that binds to the polypeptides of the
invention comprising contacting the compound with a polypeptide of the invention in a cell
for a time sufficient to form a polypeptide/compound complex, wherein the complex drives
expression of a reporter gene sequence in the cell; and detecting the complex by detecting
the reporter gene sequence expression such that if expression of the reporter gene is detected
the compound that binds to a polypeptide of the invention is identified.

The methods of the invention also provide methods for treatment which involve the 
administration of the polynucleotides or polypeptides of the invention to individuals 
exhibiting symptoms or tendencies. In addition, the invention encompasses methods for
treating diseases or disorders as recited herein comprising administering compounds and
other substances that modulate the overall activity of the target gene products. Compounds
and other substances can affect such modulation either on the level of target gene/protein
expression or target protein activity.

The polypeptides of the present invention and the polynucleotides encoding them are
also useful for the same functions known to one of skill in the art as the polypeptides and
polynucleotides to which they have homology (set forth in Tables 2A and 2B); for which
they have a signature region (as set forth in Table 3); or for which they have homology to a
gene family (as set forth in Tables 4A and 4B). If no homology is set forth for a sequence,
then the polypeptides and polynucleotides of the present invention are useful for a variety of
applications, as described herein, including use in arrays for detection. 



4. DETAILED DESCRIPTION OF THE INVENTION 



4.1 DEFINITIONS

It must be noted that as used herein and in the appended claims, the singular forms 
"a", "an" and "the" include plural references unless the context clearly dictates otherwise. 

The term "active" refers to those forms of the polypeptide which retain the biologic 
and/or immunologic activities of any naturally occurring polypeptide. According to the
invention, the terms "biologically active" or "biological activity" refer to a protein or peptide
having structural, regulatory or biochemical functions of a naturally occurring molecule. 
Likewise "immunologically active" or "immunological activity" refers to the capability of 
the natural, recombinant or synthetic polypeptide to induce a specific immune response in 
appropriate animals or cells and to bind with specific antibodies. 

The term "activated cells" as used in this application refers to those cells which are
engaged in extracellular or intracellular membrane trafficking, including the export of
secretory or enzymatic molecules as part of a normal or disease process. 

The terms "complementary" or "complementarity" refer to the natural binding of
polynucleotides by base pairing. For example, the sequence 5'-AGT-3' binds to the
complementary sequence 3'-TCA-5'. Complementarity between two single-stranded
molecules may be "partial" such that only certain portion(s) of the nucleic acids bind or it
may be "complete" such that total complementarity exists between the single stranded
molecules. The degree of complementarity between the nucleic acid strands has significant
effects on the efficiency and strength of the hybridization between the nucleic acid strands. 
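
As an illustrative sketch only (it is not part of the disclosed methods), the distinction between "partial" and "complete" complementarity can be expressed in a few lines of Python. The names `complement`, `reverse_complement` and `fraction_complementary` are hypothetical and chosen for this example; only the bases A, C, G and T are handled, and strands are assumed to be pre-aligned and of equal length.

```python
# Watson-Crick pairing for DNA bases (illustrative; N and RNA bases are not handled).
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}

def complement(seq):
    """Return the base-by-base complement, read in the same left-to-right order."""
    return "".join(PAIRS[b] for b in seq.upper())

def reverse_complement(seq):
    """Return the complementary strand written in the 5'->3' direction."""
    return complement(seq)[::-1]

def fraction_complementary(strand_5to3, other_3to5):
    """Fraction of aligned positions that base-pair; 1.0 means 'complete' complementarity."""
    if len(strand_5to3) != len(other_3to5):
        raise ValueError("strands must be the same length")
    hits = sum(PAIRS[a] == b for a, b in zip(strand_5to3.upper(), other_3to5.upper()))
    return hits / len(strand_5to3)

print(complement("AGT"))                     # TCA, i.e. 3'-TCA-5' for 5'-AGT-3'
print(fraction_complementary("AGT", "TCA"))  # 1.0 -> complete
print(fraction_complementary("AGT", "TCC"))  # below 1.0 -> partial
```

A value below 1.0 corresponds to partial complementarity in the sense used above; the sketch says nothing about hybridization strength, which depends on conditions not modeled here.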

The term "embryonic stem cells (ES)" refers to a cell that can give rise to many 
differentiated cell types in an embryo or an adult, including the germ cells. The term "germ 
line stem cells (GSCs)" refers to stem cells derived from primordial stem cells that provide a
steady and continuous source of germ cells for the production of gametes. The term
"primordial germ cells (PGCs)" refers to a small population of cells set aside from other cell
lineages particularly from the yolk sac, mesenteries, or gonadal ridges during embryogenesis
that have the potential to differentiate into germ cells and other cells. PGCs are the source
from which GSCs and ES cells are derived. The PGCs, the GSCs and the ES cells are
capable of self-renewal. Thus these cells not only populate the germ line and give rise to a
plurality of terminally differentiated cells that comprise the adult specialized organs, but are 
able to regenerate themselves. 

The term "expression modulating fragment," EMF, means a series of nucleotides
which modulates the expression of an operably linked ORF or another EMF.

As used herein, a sequence is said to "modulate the expression of an operably linked 
sequence" when the expression of the sequence is altered by the presence of the EMF. 
EMFs include, but are not limited to, promoters, and promoter modulating sequences 
(inducible elements). One class of EMFs comprises nucleic acid fragments which induce the
expression of an operably linked ORF in response to a specific regulatory factor or
physiological event. 

The terms "nucleotide sequence" or "nucleic acid" or "polynucleotide" or 
"oligonucleotide" are used interchangeably and refer to a heteropolymer of nucleotides or 
the sequence of these nucleotides. These phrases also refer to DNA or RNA of genomic or
synthetic origin which may be single-stranded or double-stranded and may represent the
sense or the antisense strand, to peptide nucleic acid (PNA) or to any DNA-like or RNA-like
material. In the sequences herein A is adenine, C is cytosine, T is thymine, G is guanine and
N is A, C, G, or T (U) or unknown. It is contemplated that where the polynucleotide is 
RNA, the T (thymine) in the sequences provided herein is substituted with U (uracil). 

Generally, nucleic acid segments provided by this invention may be assembled from
fragments of the genome and short oligonucleotide linkers, or from a series of
oligonucleotides, or from individual nucleotides, to provide a synthetic nucleic acid which is
capable of being expressed in a recombinant transcriptional unit comprising regulatory
elements derived from a microbial or viral operon, or a eukaryotic gene. 

The terms "oligonucleotide fragment" or a "polynucleotide fragment", "portion," or 
"segment" or "probe" or "primer" are used interchangeably and refer to a sequence of 
nucleotide residues which are at least about 5 nucleotides, more preferably at least about 7
nucleotides, more preferably at least about 9 nucleotides, more preferably at least about 11
nucleotides and most preferably at least about 17 nucleotides. The fragment is preferably
less than about 500 nucleotides, preferably less than about 200 nucleotides, more preferably
less than about 100 nucleotides, more preferably less than about 50 nucleotides and most
preferably less than 30 nucleotides. Preferably the probe is from about 6 nucleotides to
about 200 nucleotides, preferably from about 15 to about 50 nucleotides, more preferably
from about 17 to 30 nucleotides and most preferably from about 20 to 25 nucleotides.
Preferably the fragments can be used in polymerase chain reaction (PCR), various
hybridization procedures or microarray procedures to identify or amplify identical or related
parts of mRNA or DNA molecules. A fragment or segment may uniquely identify each
polynucleotide sequence of the present invention. Preferably the fragment comprises a 
sequence substantially similar to any one of SEQ ID NO: 1-276, or 553-772. 

Probes may, for example, be used to determine whether specific mRNA molecules 
are present in a cell or tissue or to isolate similar nucleic acid sequences from chromosomal
DNA as described by Walsh et al. (Walsh, P.S. et al., 1992, PCR Methods Appl 1:241-250).
They may be labeled by nick translation, Klenow fill-in reaction, PCR, or other methods
well known in the art. Probes of the present invention, their preparation and/or labeling are
elaborated in Sambrook, J. et al., 1989, Molecular Cloning: A Laboratory Manual, Cold
Spring Harbor Laboratory, NY; or Ausubel, F.M. et al., 1989, Current Protocols in
Molecular Biology, John Wiley & Sons, New York NY, both of which are incorporated
herein by reference in their entirety. 

The nucleic acid sequences of the present invention also include the sequence 
information from the nucleic acid sequences of SEQ ID NO: 1-276, or 553-772. The 
sequence information can be a segment of any one of SEQ ID NO: 1-276, or 553-772 that
uniquely identifies or represents the sequence information of that sequence of SEQ ID NO:
1-276, or 553-772, or those segments identified in Tables 3, 4A, 4B, 5, 6, or 8. One such
segment can be a twenty-mer nucleic acid sequence because the probability that a
twenty-mer is fully matched in the human genome is 1 in 300. In the human genome, there are
three billion base pairs in one set of chromosomes. Because 4^20 possible twenty-mers exist, there
are 300 times more twenty-mers than there are base pairs in a set of human chromosomes.
Using the same analysis, the probability for a seventeen-mer to be fully matched in the
human genome is approximately 1 in 5. When these segments are used in arrays for
expression studies, fifteen-mer segments can be used. The probability that the fifteen-mer is
fully matched in the expressed sequences is also approximately one in five because
expressed sequences comprise less than approximately 5% of the entire genome sequence. 
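
The arithmetic behind these estimates can be checked with a short Python sketch. It is offered as a rough illustration under simplifying assumptions that are not stated in the text: bases are treated as uniformly random and independent, the genome size is taken as exactly three billion base pairs, and the expressed fraction is taken as exactly 5%. The names `possible_kmers` and `kmers_per_position` are hypothetical.

```python
GENOME_BP = 3_000_000_000    # ~3 billion base pairs per set of chromosomes, as stated above
EXPRESSED_FRACTION = 0.05    # "less than approximately 5%"; 5% is used here as an assumed bound

def possible_kmers(k):
    """Number of distinct sequences of length k over the four bases."""
    return 4 ** k

def kmers_per_position(k, target_bp):
    """Ratio of possible k-mers to positions in the target.

    Under the uniform-random assumption, a chance full match occurs
    about once in this many k-mers.
    """
    return possible_kmers(k) / target_bp

for k, target, label in [
    (20, GENOME_BP, "20-mer vs genome"),
    (17, GENOME_BP, "17-mer vs genome"),
    (15, GENOME_BP * EXPRESSED_FRACTION, "15-mer vs expressed sequences"),
]:
    print(f"{label}: about 1 in {kmers_per_position(k, target):.1f}")
```

Under these assumptions the ratios come out in the same range as the rounded figures quoted above (a few hundred for twenty-mers, and single digits for seventeen-mers in the genome and fifteen-mers in expressed sequences); the exact values depend on the genome size and expressed fraction assumed.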

Similarly, when using sequence information for detecting a single mismatch, a segment 
can be a twenty-five mer. The probability that the twenty-five mer would appear in a human
genome with a single mismatch is calculated by multiplying the probability for a full match
(1 ÷ 4^25) times the increased probability for mismatch at each nucleotide position (3 x 25). The
probability that an eighteen mer with a single mismatch can be detected in an array for 
expression studies is approximately one in five. The probability that a twenty-mer with a single 
mismatch can be detected in a human genome is approximately one in five. 
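
The single-mismatch calculation described above can likewise be sketched in Python. This is an illustrative approximation only: it applies the stated formula, (1 ÷ 4^k) x (3 x k), assumes uniformly random bases, and multiplies by an assumed target size (three billion base pairs for the genome, 5% of that for expressed sequences) to obtain an expected number of chance sites. The function names are hypothetical.

```python
GENOME_BP = 3_000_000_000    # assumed haploid genome size in base pairs
EXPRESSED_FRACTION = 0.05    # assumed upper bound on the expressed fraction

def single_mismatch_probability(k):
    """Approximate chance that a random site matches a given k-mer with one mismatch.

    Follows the formula in the text: full-match probability (1 / 4**k)
    times the number of single-base alternatives (3 per position, k positions).
    """
    return (3 * k) / 4 ** k

def expected_sites(k, target_bp):
    """Expected number of chance single-mismatch sites in a target of target_bp positions."""
    return single_mismatch_probability(k) * target_bp

for k, target, label in [
    (25, GENOME_BP, "25-mer vs genome"),
    (20, GENOME_BP, "20-mer vs genome"),
    (18, GENOME_BP * EXPRESSED_FRACTION, "18-mer vs expressed sequences"),
]:
    print(f"{label}: expected chance sites = {expected_sites(k, target):.3g}")
```

With these assumed inputs, the twenty-mer and eighteen-mer cases give expected values somewhat below one in five, which is the same order of magnitude as the rounded figures in the text; the twenty-five-mer case is far smaller, which is consistent with the longer segment being suggested for mismatch detection.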

The term "open reading frame," ORF, means a series of nucleotide triplets coding for
amino acids without any termination codons and is a sequence translatable into protein. 
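
To make the definition concrete, the following Python sketch scans a DNA string for the longest run of triplets free of termination codons. It is a minimal illustration under stated assumptions: only the three forward reading frames of the given strand are examined, the standard stop codons TAA, TAG and TGA are assumed, and no start codon is required because the definition above does not call for one. The function name `longest_orf` is hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard termination codons (assumed)

def longest_orf(seq):
    """Return (frame, start, end) of the longest stop-free run of triplets.

    Coordinates are 0-based and end-exclusive, so seq[start:end] is the run.
    Only the three forward frames of the given strand are scanned.
    """
    seq = seq.upper()
    best = (0, 0, 0)
    for frame in range(3):
        run_start = frame
        for i in range(frame, len(seq) - 2, 3):
            if seq[i:i + 3] in STOP_CODONS:
                # A stop codon closes the current run just before position i.
                if i - run_start > best[2] - best[1]:
                    best = (frame, run_start, i)
                run_start = i + 3
        # Close the final run at the last complete triplet of this frame.
        end = frame + ((len(seq) - frame) // 3) * 3
        if end - run_start > best[2] - best[1]:
            best = (frame, run_start, end)
    return best

frame, start, end = longest_orf("ATGGCCTAA")
print(frame, start, end, "ATGGCCTAA"[start:end])
```

In practice an ORF search would usually also consider the reverse complement strand and a minimum length threshold; those refinements are omitted to keep the sketch short.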

The terms "operably linked" or "operably associated" refer to functionally related 
nucleic acid sequences. For example, a promoter is operably associated or operably linked 
with a coding sequence if the promoter controls the transcription of the coding sequence.
While operably linked nucleic acid sequences can be contiguous and in the same reading
frame, certain genetic elements, e.g., repressor genes, are not contiguously linked to the coding
sequence but still control transcription/translation of the coding sequence. 

The term "pluripotent" refers to the capability of a cell to differentiate into a number
of differentiated cell types that are present in an adult organism. A pluripotent cell is
restricted in its differentiation capability in comparison to a totipotent cell.

The terms "polypeptide" or "peptide" or "amino acid sequence" refer to an
oligopeptide, peptide, polypeptide or protein sequence or fragment thereof and to naturally
occurring or synthetic molecules. A polypeptide "fragment," "portion," or "segment" is a
stretch of amino acid residues of at least about 5 amino acids, preferably at least about 7
amino acids, more preferably at least about 9 amino acids and most preferably at least about
17 or more amino acids. The peptide preferably is not greater than about 200 amino acids,
more preferably less than 150 amino acids and most preferably less than 100 amino acids.
Preferably the peptide is from about 5 to about 200 amino acids. To be active, any
polypeptide must have sufficient length to display biological and/or immunological activity. 

The term "naturally occurring polypeptide" refers to polypeptides produced by cells 
that have not been genetically engineered and specifically contemplates various polypeptides 
arising from post-translational modifications of the polypeptide including, but not limited to,
acetylation, carboxylation, glycosylation, phosphorylation, lipidation and acylation. 

The term "translated protein coding portion" means a sequence which encodes the
full-length protein which may include any leader sequence or any processing sequence.

The term "mature protein coding sequence" means a sequence which encodes a 
peptide or protein without a signal or leader sequence. The "mature protein portion" means
that portion of the protein which does not include a signal or leader sequence. The peptide
may have been produced by processing in the cell which removes any leader/signal
sequence. The mature protein portion may or may not include the initial methionine residue.
The methionine residue may be removed from the protein during processing in the cell. The
peptide may be produced synthetically or the protein may have been produced using a
polynucleotide encoding only the mature protein coding sequence.

The term "derivative" refers to polypeptides chemically modified by such techniques 
as ubiquitination, labeling (e.g., with radionuclides or various enzymes), covalent polymer 
attachment such as pegylation (derivatization with polyethylene glycol) and insertion or 
substitution by chemical synthesis of amino acids such as ornithine, which do not normally 
occur in human proteins. 

The term "variant" (or "analog") refers to any polypeptide differing from naturally 
occurring polypeptides by amino acid insertions, deletions, and substitutions, created using, 
e.g., recombinant DNA techniques. Guidance in determining which amino acid residues 
may be replaced, added or deleted without abolishing activities of interest, may be found by 
comparing the sequence of the particular polypeptide with that of homologous peptides and 
minimizing the number of amino acid sequence changes made in regions of high homology 
(conserved regions) or by replacing amino acids with consensus sequence. 

Alternatively, recombinant variants encoding these same or similar polypeptides may 
be synthesized or selected by making use of the "redundancy" in the genetic code. Various 
codon substitutions, such as the silent changes which produce various restriction sites, may 
be introduced to optimize cloning into a plasmid or viral vector or expression in a particular 
prokaryotic or eukaryotic system. Mutations in the polynucleotide sequence may be 
reflected in the polypeptide or domains of other peptides added to the polypeptide to modify 
the properties of any part of the polypeptide, to change characteristics such as ligand-binding 
affinities, interchain affinities, or degradation/turnover rate. 

Preferably, amino acid "substitutions" are the result of replacing one amino acid with 
another amino acid having similar structural and/or chemical properties, i.e., conservative 
amino acid replacements. "Conservative" amino acid substitutions may be made on the 
basis of similarity in polarity, charge, solubility, hydrophobicity, hydrophilicity, and/or the 
amphipathic nature of the residues involved. For example, nonpolar (hydrophobic) amino 
acids include alanine, leucine, isoleucine, valine, proline, phenylalanine, tryptophan, and 
methionine; polar neutral amino acids include glycine, serine, threonine, cysteine, tyrosine, 
asparagine, and glutamine; positively charged (basic) amino acids include arginine, lysine, 
and histidine; and negatively charged (acidic) amino acids include aspartic acid and glutamic 
acid. "Insertions" or "deletions" are preferably in the range of about 1 to 20 amino acids, 
more preferably 1 to 10 amino acids. The variation allowed may be experimentally 
determined by systematically making insertions, deletions, or substitutions of amino acids in 
a polypeptide molecule using recombinant DNA techniques and assaying the resulting 
recombinant variants for activity. 
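
By way of illustration only, and not as part of the disclosed methods, the four residue classes listed above can be written as a lookup table, with a substitution treated as conservative when both residues fall in the same class. The one-letter residue codes, the class names, and the function names `residue_class` and `is_conservative` are assumptions introduced for this sketch.

```python
# Residue classes as enumerated in the text, written with the standard
# one-letter amino acid codes (an assumption of this sketch).
RESIDUE_CLASSES = {
    "nonpolar": set("ALIVPFWM"),      # Ala, Leu, Ile, Val, Pro, Phe, Trp, Met
    "polar_neutral": set("GSTCYNQ"),  # Gly, Ser, Thr, Cys, Tyr, Asn, Gln
    "basic": set("RKH"),              # Arg, Lys, His
    "acidic": set("DE"),              # Asp, Glu
}


def residue_class(residue: str) -> str:
    """Return the name of the class that contains the given residue."""
    for name, members in RESIDUE_CLASSES.items():
        if residue.upper() in members:
            return name
    raise ValueError(f"not one of the twenty listed residues: {residue!r}")


def is_conservative(original: str, replacement: str) -> bool:
    """Treat a substitution as conservative when both residues share a class."""
    return residue_class(original) == residue_class(replacement)
```

Under this simplification, leucine to isoleucine is conservative and leucine to aspartic acid is not. Whether any given substitution actually preserves activity remains a matter for the experimental assay described above.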

Alternatively, where alteration of function is desired, insertions, deletions or 
non-conservative alterations can be engineered to produce altered polypeptides. Such 
alterations can, for example, alter one or more of the biological functions or biochemical 
characteristics of the polypeptides of the invention. For example, such alterations may 
change polypeptide characteristics such as ligand-binding affinities, interchain affinities, or 
degradation/turnover rate. Further, such alterations can be selected so as to generate 
polypeptides that are better suited for expression, scale up and the like in the host cells 
chosen for expression. For example, cysteine residues can be deleted or substituted with 
another amino acid residue in order to eliminate disulfide bridges. 

The terms "purified" or "substantially purified" as used herein denote that the 
indicated nucleic acid or polypeptide is present in the substantial absence of other biological 
macromolecules, e.g., polynucleotides, proteins, and the like. In one embodiment, the 
polynucleotide or polypeptide is purified such that it constitutes at least 95% by weight, 
more preferably at least 99% by weight, of the indicated biological macromolecules present 
(but water, buffers, and other small molecules, especially molecules having a molecular 
weight of less than 1000 daltons, can be present). 

The term "isolated" as used herein refers to a nucleic acid or polypeptide separated 
from at least one other component (e.g., nucleic acid or polypeptide) present with the nucleic 
acid or polypeptide in its natural source. In one embodiment, the nucleic acid or polypeptide 
is found in the presence of (if anything) only a solvent, buffer, ion, or other component 
normally present in a solution of the same. The terms "isolated" and "purified" do not 
encompass nucleic acids or polypeptides present in their natural source. 

The term "recombinant," when used herein to refer to a polypeptide or protein, means 
that a polypeptide or protein is derived from recombinant (e.g., microbial, insect, or 
mammalian) expression systems. "Microbial" refers to recombinant polypeptides or proteins 
made in bacterial or fungal (e.g., yeast) expression systems. As a product, "recombinant 
microbial" defines a polypeptide or protein essentially free of native endogenous substances 
and unaccompanied by associated native glycosylation. Polypeptides or proteins expressed 
in most bacterial cultures, e.g., E. coli, will be free of glycosylation modifications; 
polypeptides or proteins expressed in yeast will have a glycosylation pattern in general 
different from those expressed in mammalian cells. 

The term "recombinant expression vehicle or vector" refers to a plasmid or phage or 
virus or vector, for expressing a polypeptide from a DNA (RNA) sequence. An expression 
vehicle can comprise a transcriptional unit comprising an assembly of (1) a genetic element 
or elements having a regulatory role in gene expression, for example, promoters or 
enhancers, (2) a structural or coding sequence which is transcribed into mRNA and 
translated into protein, and (3) appropriate transcription initiation and termination sequences. 
Structural units intended for use in yeast or eukaryotic expression systems preferably include 
a leader sequence enabling extracellular secretion of translated protein by a host cell. 
Alternatively, where recombinant protein is expressed without a leader or transport 
sequence, it may include an amino terminal methionine residue. This residue may or may 
not be subsequently cleaved from the expressed recombinant protein to provide a final 
product. 

The term "recombinant expression system" means host cells which have stably 
integrated a recombinant transcriptional unit into chromosomal DNA or carry the 
recombinant transcriptional unit extrachromosomally. Recombinant expression systems as 
defined herein will express heterologous polypeptides or proteins upon induction of the 
regulatory elements linked to the DNA segment or synthetic gene to be expressed. This term 
also means host cells which have stably integrated a recombinant genetic element or 
elements having a regulatory role in gene expression, for example, promoters or enhancers. 
Recombinant expression systems as defined herein will express polypeptides or proteins 
endogenous to the cell upon induction of the regulatory elements linked to the endogenous 
DNA segment or gene to be expressed. The cells can be prokaryotic or eukaryotic. 

The term "secreted" includes a protein that is transported across or through a 
membrane, including transport as a result of signal sequences in its amino acid sequence 
when it is expressed in a suitable host cell. "Secreted" proteins include without limitation 
proteins secreted wholly (e.g., soluble proteins) or partially (e.g., receptors) from the cell in 
which they are expressed. "Secreted" proteins also include without limitation proteins that 
are transported across the membrane of the endoplasmic reticulum. "Secreted" proteins are 
also intended to include proteins containing non-typical signal sequences (e.g., Interleukin-1 
Beta, see Krasney, P.A. and Young, P.R. (1992) Cytokine 4(2):134-143) and factors 
released from damaged cells (e.g., Interleukin-1 Receptor Antagonist, see Arend, W.P. et al. 
(1998) Annu. Rev. Immunol. 16:27-55). 

Where desired, an expression vector may be designed to contain a "signal or leader 
sequence" which will direct the polypeptide through the membrane of a cell. Such a 
sequence may be naturally present on the polypeptides of the present invention or provided 
from heterologous protein sources by recombinant DNA techniques. 

The term "stringent" is used to refer to conditions that are commonly understood in 
the art as stringent. Stringent conditions can include highly stringent conditions (i.e., 
hybridization to filter-bound DNA in 0.5 M NaHPO4, 7% sodium dodecyl sulfate (SDS), 1 
mM EDTA at 65°C, and washing in 0.1X SSC/0.1% SDS at 68°C), and moderately stringent 
conditions (i.e., washing in 0.2X SSC/0.1% SDS at 42°C). Other exemplary hybridization 
conditions are described herein in the examples. 

In instances of hybridization of deoxyoligonucleotides, additional exemplary 
stringent hybridization conditions include washing in 6X SSC/0.05% sodium pyrophosphate 
at 37°C (for 14-base oligonucleotides), 48°C (for 17-base oligonucleotides), 55°C (for 
20-base oligonucleotides), and 60°C (for 23-base oligonucleotides). 

As used herein, "substantially equivalent" or "substantially similar" can refer both to 
nucleotide and amino acid sequences, for example a mutant sequence, that varies from a 
reference sequence by one or more substitutions, deletions, or additions, the net effect of 
which does not result in an adverse functional dissimilarity between the reference and 
subject sequences. Typically, such a substantially equivalent sequence varies from one of 
those listed herein by no more than about 35% (i.e., the number of individual residue 
substitutions, additions, and/or deletions in a substantially equivalent sequence, as compared 
to the corresponding reference sequence, divided by the total number of residues in the 
substantially equivalent sequence is about 0.35 or less). Such a sequence is said to have 
65% sequence identity to the listed sequence. In one embodiment, a substantially 
equivalent, e.g., mutant, sequence of the invention varies from a listed sequence by no more 
than 30% (70% sequence identity); in a variation of this embodiment, by no more than 25% 
(75% sequence identity); and in a further variation of this embodiment, by no more than 
20% (80% sequence identity) and in a further variation of this embodiment, by no more than 
10% (90% sequence identity) and in a further variation of this embodiment, by no more than 
5% (95% sequence identity). Substantially equivalent, mutant, amino acid sequences 
according to the invention preferably have at least 80% sequence identity with a listed amino 
acid sequence, more preferably at least 85% sequence identity, more preferably at least 90% 
sequence identity, more preferably at least 95% sequence identity, more preferably at least 
98% sequence identity, and most preferably at least 99% sequence identity. Substantially 
equivalent nucleotide sequences of the invention can have lower percent sequence identities, 
taking into account, for example, the redundancy or degeneracy of the genetic code. 
Preferably, the nucleotide sequence has at least about 65% identity, more preferably at least 
about 75% identity, more preferably at least about 80% sequence identity, more preferably at 
least 85% sequence identity, more preferably at least 90% sequence identity, more preferably 
at least about 95% sequence identity, more preferably at least 98% sequence identity, and 
most preferably at least 99% sequence identity. For the purposes of the present invention, 
sequences having substantially equivalent biological activity and substantially equivalent 
expression characteristics are considered substantially equivalent. For the purposes of 
determining equivalence, truncation of the mature sequence (e.g., via a mutation which 
creates a new stop codon) should be disregarded. Sequence identity may be determined, 
e.g., using the Jotun Hein method (Hein, J. (1990) Methods Enzymol. 183:626-645). 
Identity between sequences can also be determined by other methods known in the art, e.g., 
by varying hybridization conditions. 
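
As a hedged illustration of the ratio described above (the number of substitutions, additions, and/or deletions divided by the total number of residues in the substantially equivalent sequence), the following sketch operates on two sequences that are assumed to have been aligned already, with "-" marking gap positions. It does not implement the Jotun Hein method or any other alignment method; the function names and the gap convention are assumptions introduced for this example.

```python
def fraction_varied(aligned_reference: str, aligned_subject: str) -> float:
    """Changes per residue of the subject sequence, for a pre-aligned pair.

    A column counts as one change when the two symbols differ, which covers
    substitutions as well as gaps ("-") on either side.
    """
    if len(aligned_reference) != len(aligned_subject):
        raise ValueError("aligned sequences must have the same length")
    changes = sum(1 for r, s in zip(aligned_reference, aligned_subject) if r != s)
    subject_residues = sum(1 for s in aligned_subject if s != "-")
    if subject_residues == 0:
        raise ValueError("subject sequence contains no residues")
    return changes / subject_residues


def is_substantially_equivalent(aligned_reference: str, aligned_subject: str,
                                max_fraction: float = 0.35) -> bool:
    """Apply the 'about 0.35 or less' ratio from the text as a hard cutoff."""
    return fraction_varied(aligned_reference, aligned_subject) <= max_fraction
```

For example, a 10-residue subject differing from its reference at one position gives a ratio of 0.1, corresponding to 90% sequence identity in the terminology above. Treating "about 0.35" as an exact cutoff is a simplification of this sketch.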

The term "totipotent" refers to the capability of a cell to differentiate into all of the 
cell types of an adult organism. 

The term "transformation" means introducing DNA into a suitable host cell so that 
the DNA is replicable, either as an extrachromosomal element, or by chromosomal 
integration. The term "transfection" refers to the taking up of an expression vector by a 
suitable host cell, whether or not any coding sequences are in fact expressed. The term 
"infection" refers to the introduction of nucleic acids into a suitable host cell by use of a 
virus or viral vector. 

As used herein, an "uptake modulating fragment," UMF, means a series of 
nucleotides which mediate the uptake of a linked DNA fragment into a cell. UMFs can be 
readily identified using known UMFs as a target sequence or target motif with the 
computer-based systems described below. The presence and activity of a UMF can be 
confirmed by attaching the suspected UMF to a marker sequence. The resulting nucleic acid 
molecule is then incubated with an appropriate host under appropriate conditions and the 
uptake of the marker sequence is determined. As described above, a UMF will increase the 
frequency of uptake of a linked marker sequence. 

Each of the above terms is meant to encompass all that is described for each, unless 
the context dictates otherwise. 

4.2 NUCLEIC ACIDS OF THE INVENTION 

Nucleotide sequences of the invention are set forth in the Sequence Listing. 
The isolated polynucleotides of the invention include a polynucleotide comprising 
the nucleotide sequences of SEQ ID NO: 1-276, or 553-772; a polynucleotide encoding any 
one of the peptide sequences of SEQ ID NO: 1-276, or 553-772; and a polynucleotide 
comprising the nucleotide sequence encoding the mature protein coding sequence of the 
polynucleotides of any one of SEQ ID NO: 1-276, or 553-772. The polynucleotides of the 
present invention also include, but are not limited to, a polynucleotide that hybridizes under 
stringent conditions to (a) the complement of any of the nucleotide sequences of SEQ ID 
NO: 1-276, or 553-772; (b) nucleotide sequences encoding any one of the amino acid 
sequences set forth in the Sequence Listing, or Table 8; (c) a polynucleotide which is an 
allelic variant of any polynucleotide recited above; (d) a polynucleotide which encodes a 
species homologue of any of the proteins recited above; or (e) a polynucleotide that encodes 
a polypeptide comprising a specific domain or truncation of the polypeptides of SEQ ID NO: 
277-552, or 773-992 (for example, as set forth in Tables 3, 4A, 4B, 5, 6, or 8). Domains of 
interest may depend on the nature of the encoded polypeptide; e.g., domains in receptor-like 
polypeptides include ligand-binding, extracellular, transmembrane, or cytoplasmic domains, 
or combinations thereof; domains in immunoglobulin-like proteins include the variable 
immunoglobulin-like domains; domains in enzyme-like polypeptides include catalytic and 
substrate binding domains; and domains in ligand polypeptides include receptor-binding 
domains. 

The polynucleotides of the invention include naturally occurring or wholly or 
partially synthetic DNA, e.g., cDNA and genomic DNA, and RNA, e.g., mRNA. The 
polynucleotides may include the entire coding region of the cDNA or may represent a portion of 
the coding region of the cDNA. 

The present invention also provides genes corresponding to the cDNA sequences 
disclosed herein. The corresponding genes can be isolated in accordance with known methods 
using the sequence information disclosed herein. Such methods include the preparation of 
probes or primers from the disclosed sequence information for identification and/or 
amplification of genes in appropriate genomic libraries or other sources of genomic materials. 
Further 5' and 3' sequence can be obtained using methods known in the art. For example, full 
length cDNA or genomic DNA that corresponds to any of the polynucleotides of SEQ ID NO: 
1-276, or 553-772 can be obtained by screening appropriate cDNA or genomic DNA libraries 
under suitable hybridization conditions using any of the polynucleotides of SEQ ID NO: 1-276, 
or 553-772 or a portion thereof as a probe. Alternatively, the polynucleotides of SEQ ID NO: 
1-276, or 553-772 may be used as the basis for suitable primer(s) that allow identification 
and/or amplification of genes in appropriate genomic DNA or cDNA libraries. 

The nucleic acid sequences of the invention can be assembled from ESTs and sequences 
(including cDNA and genomic sequences) obtained from one or more public databases, such as 
dbEST, gbpri, and UniGene. The EST sequences can provide identifying sequence 
information, representative fragment or segment information, or novel segment information for 
the full-length gene. 

The polynucleotides of the invention also provide polynucleotides including 
nucleotide sequences that are substantially equivalent to the polynucleotides recited above. 
Polynucleotides according to the invention can have, e.g., at least about 65%, at least about 
70%, at least about 75%, at least about 80%, 81%, 82%, 83%, 84%, more typically at least 
about 85%, 86%, 87%, 88%, 89%, more typically at least about 90%, 91%, 92%, 93%, 94%, 
and even more typically at least about 95%, 96%, 97%, 98%, 99% sequence identity to a 
polynucleotide recited above. 

Included within the scope of the nucleic acid sequences of the invention are nucleic 
acid sequence fragments that hybridize under stringent conditions to any of the nucleotide 
sequences of SEQ ID NO: 1-276, or 553-772, or complements thereof, which fragment is 
greater than about 5 nucleotides, preferably 7 nucleotides, more preferably greater than 9 
nucleotides and most preferably greater than 17 nucleotides. Fragments of, e.g., 15, 17, or 20 
nucleotides or more that are selective for (i.e., specifically hybridize to) any one of the 
polynucleotides of the invention are contemplated. Probes capable of specifically 
hybridizing to a polynucleotide can differentiate polynucleotide sequences of the invention 
from other polynucleotide sequences in the same family of genes or can differentiate human 
genes from genes of other species, and are preferably based on unique nucleotide sequences. 

The sequences falling within the scope of the present invention are not limited to these 
specific sequences, but also include allelic and species variations thereof. Allelic and species 
variations can be routinely determined by comparing the sequence provided in SEQ ID NO: 
1-276, or 553-772, a representative fragment thereof, or a nucleotide sequence at least 90% 
identical, preferably 95% identical, to SEQ ID NO: 1-276, or 553-772 with a sequence from 
another isolate of the same species. Furthermore, to accommodate codon variability, the 
invention includes nucleic acid molecules coding for the same amino acid sequences as do the 
specific ORFs disclosed herein. In other words, in the coding region of an ORF, substitution of 
one codon for another codon that encodes the same amino acid is expressly contemplated. 
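
The codon substitution just described can be illustrated with a short sketch, offered only as an example and assuming the standard genetic code. The table below is built from the conventional T, C, A, G ordering of codons; the function names `translate` and `synonymous_codons` are hypothetical names introduced here.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order at each position.
# "*" marks a stop codon. Use of the standard code is an assumption of this sketch.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(orf: str) -> str:
    """Translate an in-frame coding sequence (DNA or RNA) codon by codon."""
    orf = orf.upper().replace("U", "T")
    if len(orf) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of three")
    return "".join(CODON_TABLE[orf[i:i + 3]] for i in range(0, len(orf), 3))


def synonymous_codons(codon: str) -> list:
    """List the other codons that encode the same amino acid as `codon`."""
    codon = codon.upper().replace("U", "T")
    amino_acid = CODON_TABLE[codon]
    return sorted(c for c, a in CODON_TABLE.items() if a == amino_acid and c != codon)
```

Two ORFs that differ only by such substitutions, for example ATGGCT and ATGGCG, translate to the same amino acid sequence under this table.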

The nearest neighbor or homology results for the nucleic acids of the present invention, 
including SEQ ID NO: 1-276, or 553-772 can be obtained by searching a database using an 
algorithm or a program. Preferably, a BLAST (Basic Local Alignment Search Tool) program is 
used to search for local sequence alignments (Altschul, S.F., J. Mol. Evol. 36:290-300 (1993) and 
Altschul, S.F. et al., J. Mol. Biol. 215:403-410 (1990)). Alternatively, a FASTA version 3 search 
against Genpept, using the FASTXY algorithm, may be performed. 

Species homologs (or orthologs) of the disclosed polynucleotides and proteins are 
also provided by the present invention. Species homologs may be isolated and identified by 
making suitable probes or primers from the sequences provided herein and screening a 
suitable nucleic acid source from the desired species. 

The invention also encompasses allelic variants of the disclosed polynucleotides or 
proteins; that is, naturally-occurring alternative forms of the isolated polynucleotide which 
also encode proteins which are identical, homologous or related to that encoded by the 
polynucleotides. 

The nucleic acid sequences of the invention are further directed to sequences which 
encode variants of the described nucleic acids. These amino acid sequence variants may be 
prepared by methods known in the art by introducing appropriate nucleotide changes into a 
native or variant polynucleotide. There are two variables in the construction of amino acid 
sequence variants: the location of the mutation and the nature of the mutation. Nucleic 
acids encoding the amino acid sequence variants are preferably constructed by mutating the 
polynucleotide to encode an amino acid sequence that does not occur in nature. These 
nucleic acid alterations can be made at sites that differ in the nucleic acids from different 
species (variable positions) or in highly conserved regions (constant regions). Sites at such 
locations will typically be modified in series, e.g., by substituting first with conservative 
choices (e.g., hydrophobic amino acid to a different hydrophobic amino acid) and then with 
more distant choices (e.g., hydrophobic amino acid to a charged amino acid), and then 
deletions or insertions may be made at the target site. Amino acid sequence deletions 
generally range from about 1 to 30 residues, preferably about 1 to 10 residues, and are 
typically contiguous. Amino acid insertions include amino- and/or carboxyl-terminal 
fusions ranging in length from one to one hundred or more residues, as well as intrasequence 
insertions of single or multiple amino acid residues. Intrasequence insertions may range 
generally from about 1 to 10 amino residues, preferably from 1 to 5 residues. Examples of 
terminal insertions include the heterologous signal sequences necessary for secretion or for 
intracellular targeting in different host cells and sequences such as FLAG or poly-histidine 
sequences useful for purifying the expressed protein. 

In a preferred method, polynucleotides encoding the novel amino acid sequences are 
changed via site-directed mutagenesis. This method uses oligonucleotide sequences to alter 
a polynucleotide to encode the desired amino acid variant, as well as sufficient adjacent 
nucleotides on both sides of the changed amino acid to form a stable duplex on either side of 
the site being changed. In general, the techniques of site-directed mutagenesis are well 
known to those of skill in the art and this technique is exemplified by publications such as 
Edelman et al., DNA 2:183 (1983). A versatile and efficient method for producing 
site-specific changes in a polynucleotide sequence was published by Zoller and Smith, 
Nucleic Acids Res. 10:6487-6500 (1982). PCR may also be used to create amino acid 
sequence variants of the novel nucleic acids. When small amounts of template DNA are 
used as starting material, primer(s) that differs slightly in sequence from the corresponding 
region in the template DNA can generate the desired amino acid variant. PCR amplification 
results in a population of product DNA fragments that differ from the polynucleotide 
template encoding the polypeptide at the position specified by the primer. The product DNA 
fragments replace the corresponding region in the plasmid and this gives a polynucleotide 
encoding the desired amino acid variant. 
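
The structure of such a mutagenic oligonucleotide (the altered codon flanked on both sides by unchanged template sequence) can be sketched as follows. This is a minimal example for illustration only and is not the procedure of the cited references. The function name `mutagenic_primer`, the zero-based codon index, and the default flank length of 12 nucleotides are all assumptions introduced here; a suitable flank length in practice depends on the duplex stability required.

```python
def mutagenic_primer(template: str, codon_index: int, new_codon: str,
                     flank: int = 12) -> str:
    """Return a sense-strand oligonucleotide carrying one replaced codon.

    `template` is an in-frame coding sequence, `codon_index` is the zero-based
    position of the codon to change, and `flank` is the number of unchanged
    template nucleotides kept on each side of the changed codon.
    """
    template = template.upper()
    new_codon = new_codon.upper()
    if len(new_codon) != 3:
        raise ValueError("new_codon must be exactly three nucleotides")
    start = codon_index * 3
    end = start + 3
    if start - flank < 0 or end + flank > len(template):
        raise ValueError("not enough flanking template sequence on both sides")
    return template[start - flank:start] + new_codon + template[end:end + flank]
```

For instance, with the hypothetical template ATGGCTGCTGCTAAAGGTGGTGGTTAA, replacing the fifth codon (AAA, index 4) with GAA and keeping 6 flanking nucleotides on each side yields GCTGCTGAAGGTGGT.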

A further technique for generating amino acid variants is the cassette mutagenesis 
technique described in Wells et al., Gene 34:315 (1985); and other mutagenesis techniques 
well known in the art, such as, for example, the techniques in Sambrook et al., supra, and 
Current Protocols in Molecular Biology, Ausubel et al. Due to the inherent degeneracy of 
the genetic code, other DNA sequences which encode substantially the same or a 
functionally equivalent amino acid sequence may be used in the practice of the invention for 
the cloning and expression of these novel nucleic acids. Such DNA sequences include those 
which are capable of hybridizing to the appropriate novel nucleic acid sequence under 
stringent conditions. 

Polynucleotides encoding preferred polypeptide truncations of the invention could be 
used to generate polynucleotides encoding chimeric or fusion proteins comprising one or 
more domains of the invention and heterologous protein sequences. 

The polynucleotides of the invention additionally include the complement of any of 
the polynucleotides recited above. The polynucleotide can be DNA (genomic, cDNA, 
amplified, or synthetic) or RNA. Methods and algorithms for obtaining such 
polynucleotides are well known to those of skill in the art and can include, for example, 
methods for determining hybridization conditions that can routinely isolate polynucleotides 
of the desired sequence identities. 
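
As a simple illustration of what is meant by the complement of a polynucleotide, the following sketch computes the reverse complement of a DNA sequence written 5' to 3'. It is limited to the four unambiguous DNA bases, and the function name `reverse_complement` is an assumption introduced for this example.

```python
# Watson-Crick pairing for the four unambiguous DNA bases (upper or lower case).
_DNA_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna: str) -> str:
    """Return the complementary strand of `dna`, also written 5' to 3'."""
    unexpected = set(dna) - set("ACGTacgt")
    if unexpected:
        raise ValueError(f"unexpected symbols in DNA sequence: {sorted(unexpected)}")
    return dna.translate(_DNA_COMPLEMENT)[::-1]
```

Applying the function twice returns the original sequence, which is a convenient check that the pairing table is consistent.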

In accordance with the invention, polynucleotide sequences comprising the mature 
protein coding sequences corresponding to any one of SEQ ID NO: 1-276, or 553-772, or 
functional equivalents thereof, may be used to generate recombinant DNA molecules that 
direct the expression of that nucleic acid, or a functional equivalent thereof, in appropriate 
host cells. Also included are the cDNA inserts of any of the clones identified herein. 

A polynucleotide according to the invention can be joined to any of a variety of other 
nucleotide sequences by well-established recombinant DNA techniques (see Sambrook J et 
al. (1989) Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory, NY). 
Useful nucleotide sequences for joining to polynucleotides include an assortment of vectors, 
e.g., plasmids, cosmids, lambda phage derivatives, phagemids, and the like, that are well 
known in the art. Accordingly, the invention also provides a vector including a 
polynucleotide of the invention and a host cell containing the polynucleotide. In general, the 
vector contains an origin of replication functional in at least one organism, convenient 
restriction endonuclease sites, and a selectable marker for the host cell. Vectors according to 
the invention include expression vectors, replication vectors, probe generation vectors, and 
sequencing vectors. A host cell according to the invention can be a prokaryotic or 
eukaryotic cell and can be a unicellular organism or part of a multicellular organism. 

The present invention further provides recombinant constructs comprising a nucleic 
acid having any of the nucleotide sequences of SEQ ID NO: 1-276, or 553-772 or a fragment 
thereof or any other polynucleotides of the invention. In one embodiment, the recombinant 
constructs of the present invention comprise a vector, such as a plasmid or viral vector, into 
which a nucleic acid having any of the nucleotide sequences of SEQ ID NO: 1-276, or 
553-772 or a fragment thereof is inserted, in a forward or reverse orientation. In the case of a 
vector comprising one of the ORFs of the present invention, the vector may further comprise 
regulatory sequences, including for example, a promoter, operably linked to the ORF. Large 
numbers of suitable vectors and promoters are known to those of skill in the art and are 
commercially available for generating the recombinant constructs of the present invention. 

The following vectors are provided by way of example: Bacterial: pBs, phagescript, 
PsiX174, pBluescript SK, pBs KS, pNH8a, pNH16a, pNH18a, pNH46a (Stratagene), 
pTrc99A, pKK223-3, pKK233-3, pDR540, pRIT5 (Pharmacia); Eukaryotic: pWLneo, 
pSV2cat, pOG44, PXTI, pSG (Stratagene) pSVK3, pBPV, pMSG, pSVL (Pharmacia). 

The isolated polynucleotide of the invention may be operably linked to an expression 
control sequence such as the pMT2 or pED expression vectors disclosed in Kaufman et al., 
Nucleic Acids Res. 19, 4485-4490 (1991), in order to produce the protein recombinantly. 
Many suitable expression control sequences are known in the art. General methods of 
expressing recombinant proteins are also known and are exemplified in R. Kaufman, 
Methods in Enzymology 185, 537-566 (1990). As defined herein "operably linked" means 
that the isolated polynucleotide of the invention and an expression control sequence are 
situated within a vector or cell in such a way that the protein is expressed by a host cell 
which has been transformed (transfected) with the ligated polynucleotide/expression control 
sequence. 

Promoter regions can be selected from any desired gene using CAT 
(chloramphenicol acetyltransferase) vectors or other vectors with selectable markers. Two 
appropriate vectors are pKK232-8 and pCM7. Particular named bacterial promoters include 
lacI, lacZ, T3, T7, gpt, lambda PR, and trc. Eukaryotic promoters include CMV immediate 
early, HSV thymidine kinase, early and late SV40, LTRs from retrovirus, and mouse 
metallothionein-I. Selection of the appropriate vector and promoter is well within the level 
of ordinary skill in the art. Generally, recombinant expression vectors will include origins of 
replication and selectable markers permitting transformation of the host cell, e.g., the 
ampicillin resistance gene of E. coli and S. cerevisiae TRP1 gene, and a promoter derived 
from a highly expressed gene to direct transcription of a downstream structural sequence. 
Such promoters can be derived from operons encoding glycolytic enzymes such as 
3-phosphoglycerate kinase (PGK), α-factor, acid phosphatase, or heat shock proteins, among 
others. The heterologous structural sequence is assembled in appropriate phase with 
translation initiation and termination sequences, and preferably, a leader sequence capable of 
directing secretion of translated protein into the periplasmic space or extracellular medium. 
Optionally, the heterologous sequence can encode a fusion protein including an amino 
terminal identification peptide imparting desired characteristics, e.g., stabilization or 
simplified purification of expressed recombinant product. Useful expression vectors for 
bacterial use are constructed by inserting a structural DNA sequence encoding a desired 
protein together with suitable translation initiation and termination signals in operable 
reading phase with a functional promoter. The vector will comprise one or more phenotypic 
selectable markers and an origin of replication to ensure maintenance of the vector and to, if 
desirable, provide amplification within the host. Suitable prokaryotic hosts for 
transformation include E. coli, Bacillus subtilis, Salmonella typhimurium and various species 
within the genera Pseudomonas, Streptomyces, and Staphylococcus, although others may 
also be employed as a matter of choice. 

As a representative but non-limiting example, useful expression vectors for bacterial
use can comprise a selectable marker and bacterial origin of replication derived from
commercially available plasmids comprising genetic elements of the well known cloning
vector pBR322 (ATCC 37017). Such commercial vectors include, for example, pKK223-3
(Pharmacia Fine Chemicals, Uppsala, Sweden) and GEM 1 (Promega Biotech, Madison, WI,
USA). These pBR322 "backbone" sections are combined with an appropriate promoter and
the structural sequence to be expressed. Following transformation of a suitable host strain
and growth of the host strain to an appropriate cell density, the selected promoter is induced
or derepressed by appropriate means (e.g., temperature shift or chemical induction) and cells
are cultured for an additional period. Cells are typically harvested by centrifugation,
disrupted by physical or chemical means, and the resulting crude extract retained for further
purification.

Polynucleotides of the invention can also be used to induce immune responses. For
example, as described in Fan et al., Nat. Biotech. 17, 870-872 (1999), incorporated herein by
reference, nucleic acid sequences encoding a polypeptide may be used to generate antibodies
against the encoded polypeptide following topical administration of naked plasmid DNA or
following injection, and preferably intra-muscular injection of the DNA. The nucleic acid
sequences are preferably inserted in a recombinant expression vector and may be in the form
of naked DNA.

4.3 ANTISENSE 

Another aspect of the invention pertains to isolated antisense nucleic acid molecules
that are hybridizable to or complementary to the nucleic acid molecule comprising the
nucleotide sequence of SEQ ID NO: 1-276, or 553-772, or fragments, analogs or derivatives
thereof. An "antisense" nucleic acid comprises a nucleotide sequence that is complementary
to a "sense" nucleic acid encoding a protein, e.g., complementary to the coding strand of a
double-stranded cDNA molecule or complementary to an mRNA sequence. In specific
aspects, antisense nucleic acid molecules are provided that comprise a sequence
complementary to at least about 10, 25, 50, 100, 250 or 500 nucleotides or an entire coding
strand, or to only a portion thereof. Nucleic acid molecules encoding fragments, homologs,
derivatives and analogs of a protein of any of SEQ ID NO: 1-276, or 553-772 or antisense
nucleic acids complementary to a nucleic acid sequence of SEQ ID NO: 1-276, or 553-772
are additionally provided.

In one embodiment, an antisense nucleic acid molecule is antisense to a "coding
region" of the coding strand of a nucleotide sequence of the invention. The term "coding
region" refers to the region of the nucleotide sequence comprising codons which are
translated into amino acid residues. In another embodiment, the antisense nucleic acid
molecule is antisense to a "noncoding region" of the coding strand of a nucleotide sequence
of the invention. The term "noncoding region" refers to 5' and 3' sequences that flank the
coding region and that are not translated into amino acids (i.e., also referred to as 5' and 3'
untranslated regions).
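
By way of non-limiting illustration only, the distinction between coding and noncoding regions drawn above can be sketched in Python. The function name `split_regions`, the toy sequence, and the simplifying choice to treat the first ATG as the translation start are assumptions made for this sketch and are not taken from the disclosure. By the usual annotation convention the stop codon is kept with the coding region here, although it is not itself translated into an amino acid.

```python
# Illustrative sketch only: names and conventions below are assumptions.
STOP_CODONS = {"TAA", "TAG", "TGA"}

def split_regions(mrna: str) -> tuple[str, str, str]:
    """Split a sense-strand sequence into (5' flank, coding region, 3' flank).

    Assumes translation starts at the first ATG/AUG and ends at the first
    in-frame stop codon, which is returned as part of the coding region.
    """
    seq = mrna.upper().replace("U", "T")
    start = seq.find("ATG")
    if start < 0:
        raise ValueError("no start codon found")
    # Walk codon by codon from the start codon until an in-frame stop.
    for i in range(start, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            end = i + 3
            return seq[:start], seq[start:end], seq[end:]
    raise ValueError("no in-frame stop codon found")

five_prime, coding, three_prime = split_regions("GGCATGGCTTAAGGG")
print(five_prime, coding, three_prime)
```

An antisense molecule directed to a "noncoding region" in the sense used above would be complementary to the first or third element returned, and one directed to the "coding region" would be complementary to the second.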

Given the coding strand sequences encoding a nucleic acid disclosed herein (e.g.,
SEQ ID NO: 1-276, or 553-772), antisense nucleic acids of the invention can be designed
according to the rules of Watson and Crick or Hoogsteen base pairing. The antisense nucleic
acid molecule can be complementary to the entire coding region of an mRNA, but more
preferably is an oligonucleotide that is antisense to only a portion of the coding or noncoding
region of an mRNA. For example, the antisense oligonucleotide can be complementary to
the region surrounding the translation start site of an mRNA. An antisense oligonucleotide
can be, for example, about 5, 10, 15, 20, 25, 30, 35, 40, 45 or 50 nucleotides in length. An
antisense nucleic acid of the invention can be constructed using chemical synthesis or
enzymatic ligation reactions using procedures known in the art. For example, an antisense
nucleic acid (e.g., an antisense oligonucleotide) can be chemically synthesized using
naturally occurring nucleotides or variously modified nucleotides designed to increase the
biological stability of the molecules or to increase the physical stability of the duplex formed
between the antisense and sense nucleic acids, e.g., phosphorothioate derivatives and
acridine substituted nucleotides can be used.
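
By way of non-limiting illustration only, the Watson-Crick design rule described above can be sketched in Python as follows. The helper names `antisense_dna` and `antisense_around_start`, the toy mRNA, and the centring of the window on the first AUG are assumptions made for this sketch; they are not part of the disclosure, and an actual design would also weigh target accessibility, specificity, and chemical modification.

```python
# Illustrative sketch only: helper names and the windowing rule are assumptions.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "U": "A"}

def antisense_dna(sense: str) -> str:
    """Return the DNA reverse complement (5'->3') of a sense DNA or RNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(sense.upper()))

def antisense_around_start(mrna: str, length: int = 20) -> str:
    """Return an antisense oligonucleotide of up to `length` nt spanning the first AUG.

    Assumes, for simplicity, that the first AUG/ATG is the translation start site.
    """
    seq = mrna.upper().replace("U", "T")
    start = seq.find("ATG")
    if start < 0:
        raise ValueError("no start codon found")
    # Place the start codon roughly in the middle of the target window.
    left = max(0, start - (length - 3) // 2)
    return antisense_dna(seq[left:left + length])

print(antisense_dna("ATGC"))
print(antisense_around_start("GGGCCCAUGGCUAAA", length=9))
```

The `length` parameter corresponds to the oligonucleotide lengths recited above (about 5 to about 50 nucleotides); the window is shorter than requested when the sequence ends before the window does.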

Examples of modified nucleotides that can be used to generate the antisense nucleic
acid include: 5-fluorouracil, 5-bromouracil, 5-chlorouracil, 5-iodouracil, hypoxanthine,
xanthine, 4-acetylcytosine, 5-(carboxyhydroxylmethyl) uracil,
5-carboxymethylaminomethyl-2-thiouridine, 5-carboxymethylaminomethyluracil,
dihydrouracil, beta-D-galactosylqueosine, inosine, N6-isopentenyladenine, 1-methylguanine,
1-methylinosine, 2,2-dimethylguanine, 2-methyladenine, 2-methylguanine,
3-methylcytosine, 5-methylcytosine, N6-adenine, 7-methylguanine,
5-methylaminomethyluracil, 5-methoxyaminomethyl-2-thiouracil, beta-D-mannosylqueosine,
5-methoxycarboxymethyluracil, 5-methoxyuracil, 2-methylthio-N6-isopentenyladenine,
uracil-5-oxyacetic acid (v), wybutoxosine, pseudouracil, queosine, 2-thiocytosine,
5-methyl-2-thiouracil, 2-thiouracil, 4-thiouracil, 5-methyluracil, uracil-5-oxyacetic acid methylester,
uracil-5-oxyacetic acid (v), 5-methyl-2-thiouracil, 3-(3-amino-3-N-2-carboxypropyl) uracil,
(acp3)w, and 2,6-diaminopurine. Alternatively, the antisense nucleic acid can be produced
biologically using an expression vector into which a nucleic acid has been subcloned in an
antisense orientation (i.e., RNA transcribed from the inserted nucleic acid will be of an
antisense orientation to a target nucleic acid of interest, described further in the following
subsection).

The antisense nucleic acid molecules of the invention are typically administered to a
subject or generated in situ such that they hybridize with or bind to cellular mRNA and/or
genomic DNA encoding a protein according to the invention to thereby inhibit expression of
the protein, e.g., by inhibiting transcription and/or translation. The hybridization can be by
conventional nucleotide complementarity to form a stable duplex, or, for example, in the
case of an antisense nucleic acid molecule that binds to DNA duplexes, through specific
interactions in the major groove of the double helix. An example of a route of
administration of antisense nucleic acid molecules of the invention includes direct injection
at a tissue site. Alternatively, antisense nucleic acid molecules can be modified to target
selected cells and then administered systemically. For example, for systemic administration,
antisense molecules can be modified such that they specifically bind to receptors or antigens
expressed on a selected cell surface, e.g., by linking the antisense nucleic acid molecules to
peptides or antibodies that bind to cell surface receptors or antigens. The antisense nucleic
acid molecules can also be delivered to cells using the vectors described herein. To achieve
sufficient intracellular concentrations of antisense molecules, vector constructs in which the
antisense nucleic acid molecule is placed under the control of a strong pol II or pol III
promoter are preferred.

In yet another embodiment, the antisense nucleic acid molecule of the invention is an
α-anomeric nucleic acid molecule. An α-anomeric nucleic acid molecule forms specific
double-stranded hybrids with complementary RNA in which, contrary to the usual β-units,
the strands run parallel to each other (Gaultier et al. (1987) Nucleic Acids Res 15:
6625-6641). The antisense nucleic acid molecule can also comprise a
2'-O-methylribonucleotide (Inoue et al. (1987) Nucleic Acids Res 15: 6131-6148) or a
chimeric RNA-DNA analogue (Inoue et al. (1987) FEBS Lett 215: 327-330).

4.4 RIBOZYMES AND PNA MOIETIES 

In still another embodiment, an antisense nucleic acid of the invention is a ribozyme.
Ribozymes are catalytic RNA molecules with ribonuclease activity that are capable of
cleaving a single-stranded nucleic acid, such as an mRNA, to which they have a
complementary region. Thus, ribozymes (e.g., hammerhead ribozymes (described in
Haselhoff and Gerlach (1988) Nature 334:585-591)) can be used to catalytically cleave
mRNA transcripts to thereby inhibit translation of an mRNA. A ribozyme having specificity
for a nucleic acid of the invention can be designed based upon the nucleotide sequence of a
DNA disclosed herein (i.e., SEQ ID NO: 1-276, or 553-772). For example, a derivative of
Tetrahymena L-19 IVS RNA can be constructed in which the nucleotide sequence of the
active site is complementary to the nucleotide sequence to be cleaved in an mRNA. See, e.g.,
Cech et al. U.S. Pat. No. 4,987,071; and Cech et al. U.S. Pat. No. 5,116,742. Alternatively,
mRNA of the invention can be used to select a catalytic RNA having a specific ribonuclease
activity from a pool of RNA molecules. See, e.g., Bartel et al. (1993) Science
261:1411-1418.

Alternatively, gene expression can be inhibited by targeting nucleotide sequences
complementary to the regulatory region (e.g., promoter and/or enhancers) to form triple
helical structures that prevent transcription of the gene in target cells. See generally, Helene
(1991) Anticancer Drug Des. 6: 569-84; Helene et al. (1992) Ann. N.Y. Acad. Sci.
660:27-36; and Maher (1992) Bioassays 14: 807-15.

In various embodiments, the nucleic acids of the invention can be modified at the
base moiety, sugar moiety or phosphate backbone to improve, e.g., the stability,
hybridization, or solubility of the molecule. For example, the deoxyribose phosphate
backbone of the nucleic acids can be modified to generate peptide nucleic acids (see Hyrup
et al. (1996) Bioorg Med Chem 4: 5-23). As used herein, the terms "peptide nucleic acids"
or "PNAs" refer to nucleic acid mimics, e.g., DNA mimics, in which the deoxyribose
phosphate backbone is replaced by a pseudopeptide backbone and only the four natural
nucleobases are retained. The neutral backbone of PNAs has been shown to allow for
specific hybridization to DNA and RNA under conditions of low ionic strength. The
synthesis of PNA oligomers can be performed using standard solid phase peptide synthesis
protocols as described in Hyrup et al. (1996) above; Perry-O'Keefe et al. (1996) PNAS 93:
14670-675.

PNAs of the invention can be used in therapeutic and diagnostic applications. For
example, PNAs can be used as antisense or antigene agents for sequence-specific modulation
of gene expression by, e.g., inducing transcription or translation arrest or inhibiting
replication. PNAs of the invention can also be used, e.g., in the analysis of single base pair
mutations in a gene by, e.g., PNA directed PCR clamping; as artificial restriction enzymes
when used in combination with other enzymes, e.g., S1 nucleases (Hyrup B. (1996) above);
or as probes or primers for DNA sequence and hybridization (Hyrup et al. (1996), above;
Perry-O'Keefe (1996), above).

In another embodiment, PNAs of the invention can be modified, e.g., to enhance
their stability or cellular uptake, by attaching lipophilic or other helper groups to PNA, by
the formation of PNA-DNA chimeras, or by the use of liposomes or other techniques of drug
delivery known in the art. For example, PNA-DNA chimeras can be generated that may
combine the advantageous properties of PNA and DNA. Such chimeras allow DNA
recognition enzymes, e.g., RNase H and DNA polymerases, to interact with the DNA
portion while the PNA portion would provide high binding affinity and specificity.
PNA-DNA chimeras can be linked using linkers of appropriate lengths selected in terms of
base stacking, number of bonds between the nucleobases, and orientation (Hyrup (1996)
above). The synthesis of PNA-DNA chimeras can be performed as described in Hyrup
(1996) above and Finn et al. (1996) Nucl Acids Res 24: 3357-63. For example, a DNA chain
can be synthesized on a solid support using standard phosphoramidite coupling chemistry,
and modified nucleoside analogs, e.g., 5'-(4-methoxytrityl)amino-5'-deoxy-thymidine
phosphoramidite, can be used between the PNA and the 5' end of DNA (Mag et al. (1989)
Nucl Acid Res 17: 5973-88). PNA monomers are then coupled in a stepwise manner to
produce a chimeric molecule with a 5' PNA segment and a 3' DNA segment (Finn et al.
(1996) above). Alternatively, chimeric molecules can be synthesized with a 5' DNA
segment and a 3' PNA segment. See, Petersen et al. (1975) Bioorg Med Chem Lett 5:
1119-11124.

In other embodiments, the oligonucleotide may include other appended groups such
as peptides (e.g., for targeting host cell receptors in vivo), or agents facilitating transport
across the cell membrane (see, e.g., Letsinger et al., 1989, Proc. Natl. Acad. Sci. U.S.A.
86:6553-6556; Lemaitre et al., 1987, Proc. Natl. Acad. Sci. 84:648-652; PCT Publication
No. WO88/09810) or the blood-brain barrier (see, e.g., PCT Publication No. WO89/10134).
In addition, oligonucleotides can be modified with hybridization triggered cleavage agents
(see, e.g., Krol et al., 1988, BioTechniques 6:958-976) or intercalating agents (see, e.g.,
Zon, 1988, Pharm. Res. 5: 539-549). To this end, the oligonucleotide may be conjugated to
another molecule, e.g., a peptide, a hybridization triggered cross-linking agent, a transport
agent, a hybridization-triggered cleavage agent, etc.

4.5 HOSTS

The present invention further provides host cells genetically engineered to contain
the polynucleotides of the invention. For example, such host cells may contain nucleic acids
of the invention introduced into the host cell using known transformation, transfection or
infection methods. The present invention still further provides host cells genetically
engineered to express the polynucleotides of the invention, wherein such polynucleotides are
in operative association with a regulatory sequence heterologous to the host cell which
drives expression of the polynucleotides in the cell.

Knowledge of nucleic acid sequences allows for modification of cells to permit, or
increase, expression of endogenous polypeptide. Cells can be modified (e.g., by
homologous recombination) to provide increased polypeptide expression by replacing, in
whole or in part, the naturally occurring promoter with all or part of a heterologous promoter
so that the cells express the polypeptide at higher levels. The heterologous promoter is
inserted in such a manner that it is operatively linked to the encoding sequences. See, for
example, PCT International Publication No. WO94/12650, PCT International Publication
No. WO92/20808, and PCT International Publication No. WO91/09955. It is also
contemplated that, in addition to heterologous promoter DNA, amplifiable marker DNA
(e.g., ada, dhfr, and the multifunctional CAD gene which encodes carbamyl phosphate
synthase, aspartate transcarbamylase, and dihydroorotase) and/or intron DNA may be
inserted along with the heterologous promoter DNA. If linked to the coding sequence,
amplification of the marker DNA by standard selection methods results in co-amplification
of the desired protein coding sequences in the cells.

The host cell can be a higher eukaryotic host cell, such as a mammalian cell, a lower
eukaryotic host cell, such as a yeast cell, or the host cell can be a prokaryotic cell, such as a
bacterial cell. Introduction of the recombinant construct into the host cell can be effected by
calcium phosphate transfection, DEAE-dextran mediated transfection, or electroporation
(Davis, L. et al., Basic Methods in Molecular Biology (1986)). The host cells containing one
of the polynucleotides of the invention can be used in conventional manners to produce the
gene product encoded by the isolated fragment (in the case of an ORF) or can be used to
produce a heterologous protein under the control of the EMF.

Any host/vector system can be used to express one or more of the ORFs of the
present invention. These include, but are not limited to, eukaryotic hosts such as HeLa cells,
CV-1 cells, COS cells, 293 cells, and Sf9 cells, as well as prokaryotic hosts such as E. coli and
B. subtilis. The most preferred cells are those which do not normally express the particular
polypeptide or protein or which express the polypeptide or protein at a low natural level.
Mature proteins can be expressed in mammalian cells, yeast, bacteria, or other cells under
the control of appropriate promoters. Cell-free translation systems can also be employed to
produce such proteins using RNAs derived from the DNA constructs of the present
invention. Appropriate cloning and expression vectors for use with prokaryotic and
eukaryotic hosts are described by Sambrook, et al., in Molecular Cloning: A Laboratory
Manual, hereby incorporated by reference.

Various mammalian cell culture systems can also be employed to express
recombinant protein. Examples of mammalian expression systems include the COS-7 lines
of monkey kidney fibroblasts, described by Gluzman, Cell 23:175 (1981). Other cell lines
capable of expressing a compatible vector are, for example, the C127, monkey COS cells,
Chinese Hamster Ovary (CHO) cells, human kidney 293 cells, human epidermal A431 cells,
human Colo205 cells, 3T3 cells, CV-1 cells, other transformed primate cell lines, normal
diploid cells, cell strains derived from in vitro culture of primary tissue, primary explants,
HeLa cells, mouse L cells, BHK, HL-60, U937, HaK or Jurkat cells. Mammalian expression
vectors will comprise an origin of replication, a suitable promoter and also any necessary
ribosome binding sites, polyadenylation site, splice donor and acceptor sites, transcriptional
termination sequences, and 5' flanking nontranscribed sequences. DNA sequences derived
from the SV40 viral genome, for example, SV40 origin, early promoter, enhancer, splice,
and polyadenylation sites may be used to provide the required nontranscribed genetic
elements. Recombinant polypeptides and proteins produced in bacterial culture are usually
isolated by initial extraction from cell pellets, followed by one or more salting-out, aqueous
ion exchange or size exclusion chromatography steps. Protein refolding steps can be used,
as necessary, in completing configuration of the mature protein. Finally, high performance
liquid chromatography (HPLC) can be employed for final purification steps. Microbial cells
employed in expression of proteins can be disrupted by any convenient method, including
freeze-thaw cycling, sonication, mechanical disruption, or use of cell lysing agents.

Alternatively, it may be possible to produce the protein in lower eukaryotes such as
yeast or insects or in prokaryotes such as bacteria. Potentially suitable yeast strains include
Saccharomyces cerevisiae, Schizosaccharomyces pombe, Kluyveromyces strains, Candida,
or any yeast strain capable of expressing heterologous proteins. Potentially suitable bacterial
strains include Escherichia coli, Bacillus subtilis, Salmonella typhimurium, or any bacterial
strain capable of expressing heterologous proteins. If the protein is made in yeast or
bacteria, it may be necessary to modify the protein produced therein, for example by
phosphorylation or glycosylation of the appropriate sites, in order to obtain the functional
protein. Such covalent attachments may be accomplished using known chemical or
enzymatic methods.

In another embodiment of the present invention, cells and tissues may be engineered
to express an endogenous gene comprising the polynucleotides of the invention under the
control of inducible regulatory elements, in which case the regulatory sequences of the
endogenous gene may be replaced by homologous recombination. As described herein, gene
targeting can be used to replace a gene's existing regulatory region with a regulatory
sequence isolated from a different gene or a novel regulatory sequence synthesized by
genetic engineering methods. Such regulatory sequences may be comprised of promoters,
enhancers, scaffold-attachment regions, negative regulatory elements, transcriptional
initiation sites, and regulatory protein binding sites or combinations of said sequences.

Alternatively, sequences which affect the structure or stability of the RNA or protein
produced may be replaced, removed, added, or otherwise modified by targeting. These
sequences include polyadenylation signals, mRNA stability elements, splice sites, leader
sequences for enhancing or modifying transport or secretion properties of the protein, or
other sequences which alter or improve the function or stability of protein or RNA
molecules.

The targeting event may be a simple insertion of the regulatory sequence, placing the
gene under the control of the new regulatory sequence, e.g., inserting a new promoter or
enhancer or both upstream of a gene. Alternatively, the targeting event may be a simple
deletion of a regulatory element, such as the deletion of a tissue-specific negative regulatory
element. Alternatively, the targeting event may replace an existing element; for example, a
tissue-specific enhancer can be replaced by an enhancer that has broader or different
cell-type specificity than the naturally occurring elements. Here, the naturally occurring
sequences are deleted and new sequences are added. In all cases, the identification of the
targeting event may be facilitated by the use of one or more selectable marker genes that are
contiguous with the targeting DNA, allowing for the selection of cells in which the
exogenous DNA has integrated into the host cell genome. The identification of the targeting
event may also be facilitated by the use of one or more marker genes exhibiting the property
of negative selection, such that the negatively selectable marker is linked to the exogenous
DNA, but configured such that the negatively selectable marker flanks the targeting
sequence, and such that a correct homologous recombination event with sequences in the
host cell genome does not result in the stable integration of the negatively selectable marker.
Markers useful for this purpose include the Herpes Simplex Virus thymidine kinase (TK)
gene or the bacterial xanthine-guanine phosphoribosyl-transferase (gpt) gene.

The gene targeting or gene activation techniques which can be used in accordance
with this aspect of the invention are more particularly described in U.S. Patent No. 5,272,071
to Chappel; U.S. Patent No. 5,578,461 to Sherwin et al.; International Application No.
PCT/US92/09627 (WO93/09222) by Selden et al.; and International Application No.
PCT/US90/06436 (WO91/06667) by Skoultchi et al., each of which is incorporated by
reference herein in its entirety.

4.6 POLYPEPTIDES OF THE INVENTION 

The isolated polypeptides of the invention include, but are not limited to, a
polypeptide comprising: the amino acid sequences set forth as any one of SEQ ID NO:
277-552, or 773-992 or an amino acid sequence encoded by any one of the nucleotide sequences
SEQ ID NO: 1-276, or 553-772 or the corresponding full length or mature protein.
Polypeptides of the invention also include polypeptides preferably with biological or
immunological activity that are encoded by: (a) a polynucleotide having any one of the
nucleotide sequences set forth in SEQ ID NO: 1-276, or 553-772 or (b) polynucleotides
encoding any one of the amino acid sequences set forth as SEQ ID NO: 277-552, or 773-992
or (c) polynucleotides that hybridize to the complement of the polynucleotides of either (a)
or (b) under stringent hybridization conditions. The invention also provides biologically
active or immunologically active variants of any of the amino acid sequences set forth as
SEQ ID NO: 277-552, or 773-992 or the corresponding full length or mature protein; and
"substantial equivalents" thereof (e.g., with at least about 65%, at least about 70%, at least
about 75%, at least about 80%, at least about 85%, 86%, 87%, 88%, 89%, at least about
90%, 91%, 92%, 93%, 94%, typically at least about 95%, 96%, 97%, more typically at least
about 98%, or most typically at least about 99% amino acid identity) that retain biological
activity. Polypeptides encoded by allelic variants may have a similar, increased, or
decreased activity compared to polypeptides comprising SEQ ID NO: 277-552, or 773-992.
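
By way of non-limiting illustration only, the amino acid identity thresholds recited above can be checked for a pair of already-aligned sequences with a short Python sketch. The function names, the toy peptides, and the convention of dividing matches by the number of alignment columns (gaps included) are assumptions made for this sketch; other identity conventions exist, and the disclosure does not prescribe this one. The alignment itself is assumed to have been produced beforehand by any suitable alignment program.

```python
# Illustrative sketch only: names and the identity convention are assumptions.
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity of two pre-aligned sequences ('-' marks a gap)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" and b == "-":
            continue  # a column that is a gap in both sequences carries no information
        columns += 1
        if a == b:
            matches += 1
    if columns == 0:
        raise ValueError("no aligned positions to compare")
    return 100.0 * matches / columns

def meets_identity_threshold(aligned_a: str, aligned_b: str, threshold: float = 65.0) -> bool:
    """True if the pair reaches at least `threshold` percent identity."""
    return percent_identity(aligned_a, aligned_b) >= threshold

print(percent_identity("MKTAY", "MKSAY"))
print(meets_identity_threshold("MKTAY", "MKSAY", threshold=85.0))
```

The default threshold of 65.0 corresponds to the lowest identity level recited above; passing 95.0 or 99.0 applies the stricter levels. Identity alone does not establish that a variant retains biological activity.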

Fragments of the proteins of the present invention which are capable of exhibiting
biological activity are also encompassed by the present invention. Fragments of the protein
may be in linear form or they may be cyclized using known methods, for example, as
described in H. U. Saragovi, et al., Bio/Technology 10, 773-778 (1992) and in R. S.
McDowell, et al., J. Amer. Chem. Soc. 114, 9245-9253 (1992), both of which are
incorporated herein by reference. Such fragments may be fused to carrier molecules such as
immunoglobulins for many purposes, including increasing the valency of protein binding
sites. Fragments are also identified in Tables 3, 4A, 4B, 5, 6, or 8.

The present invention also provides both full-length and mature forms (for example,
without a signal sequence or precursor sequence) of the disclosed proteins. The protein
coding sequence is identified in the sequence listing by translation of the disclosed
nucleotide sequences. The predicted signal sequence is set forth in Table 6. The mature
form of such protein may be obtained and confirmed by expression of a full-length
polynucleotide in a suitable mammalian cell or other host cell and sequencing of the cleaved
product. One of skill in the art will recognize that the actual cleavage site may be different
from that predicted in Table 6. The sequence of the mature form of the protein is also
determinable from the amino acid sequence of the full-length form. Where proteins of the
present invention are membrane bound, soluble forms of the proteins are also provided. In
such forms, part or all of the regions causing the proteins to be membrane bound are deleted
so that the proteins are fully secreted from the cell in which they are expressed (see, e.g.,
Sakal et al., Prep. Biochem. Biotechnol. (2000), 30(2), pp. 107-23, incorporated herein by
reference).

Protein compositions of the present invention may further comprise an acceptable
carrier, such as a hydrophilic, e.g., pharmaceutically acceptable, carrier.

The present invention further provides isolated polypeptides encoded by the nucleic
acid fragments of the present invention or by degenerate variants of the nucleic acid
fragments of the present invention. By "degenerate variant" is intended nucleotide
fragments which differ from a nucleic acid fragment of the present invention (e.g., an ORF)
by nucleotide sequence but, due to the degeneracy of the genetic code, encode an identical
polypeptide sequence. Preferred nucleic acid fragments of the present invention are the
ORFs that encode proteins.
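
By way of non-limiting illustration only, the definition of a "degenerate variant" given above can be expressed as a short Python check: two nucleotide sequences that differ in sequence but translate to the same polypeptide. The function names and toy ORFs are assumptions made for this sketch, and the standard genetic code is assumed; organisms or organelles that use a different code would need a different table.

```python
# Illustrative sketch only: names are assumptions; the standard genetic code is assumed.
from itertools import product

BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Codons enumerated in TCAG order (TTT, TTC, TTA, TTG, TCT, ...) paired with their residues.
CODON_TABLE = {"".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

def translate(orf: str) -> str:
    """Translate a sense-strand ORF in frame 1; '*' marks a stop codon."""
    seq = orf.upper().replace("U", "T")
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))

def is_degenerate_variant(orf_a: str, orf_b: str) -> bool:
    """True if the two sequences differ but encode an identical polypeptide."""
    a = orf_a.upper().replace("U", "T")
    b = orf_b.upper().replace("U", "T")
    return a != b and translate(a) == translate(b)

print(translate("ATGGCTTAA"))
print(is_degenerate_variant("ATGGCTTAA", "ATGGCCTAA"))
print(is_degenerate_variant("ATGGCTTAA", "ATGGATTAA"))
```

In the first comparison GCT and GCC both encode alanine, so the sequences are degenerate variants of one another; in the second, GAT encodes aspartate, so the encoded polypeptide differs.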

A variety of methodologies known in the art can be utilized to obtain any one of the
isolated polypeptides or proteins of the present invention. At the simplest level, the amino
acid sequence can be synthesized using commercially available peptide synthesizers. The
synthetically-constructed protein sequences, by virtue of sharing primary, secondary or
tertiary structural and/or conformational characteristics with proteins, may possess biological
properties in common therewith, including protein activity. This technique is particularly
useful in producing small peptides and fragments of larger polypeptides. Fragments are
useful, for example, in generating antibodies against the native polypeptide. Thus, they may
be employed as biologically active or immunological substitutes for natural, purified
proteins in screening of therapeutic compounds and in immunological processes for the
development of antibodies.

The polypeptides and proteins of the present invention can alternatively be purified
from cells which have been altered to express the desired polypeptide or protein. As used
herein, a cell is said to be altered to express a desired polypeptide or protein when the cell,
through genetic manipulation, is made to produce a polypeptide or protein which it normally
does not produce or which the cell normally produces at a lower level. One skilled in the art
can readily adapt procedures for introducing and expressing either recombinant or synthetic
sequences into eukaryotic or prokaryotic cells in order to generate a cell which produces one
of the polypeptides or proteins of the present invention.

The invention also relates to methods for producing a polypeptide comprising
growing a culture of host cells of the invention in a suitable culture medium, and purifying
the protein from the cells or the culture in which the cells are grown. For example, the
methods of the invention include a process for producing a polypeptide in which a host cell
containing a suitable expression vector that includes a polynucleotide of the invention is
cultured under conditions that allow expression of the encoded polypeptide. The
polypeptide can be recovered from the culture, conveniently from the culture medium, or
from a lysate prepared from the host cells and further purified. Preferred embodiments
include those in which the protein produced by such process is a full length or mature form
of the protein.

In an alternative method, the polypeptide or protein is purified from bacterial cells
which naturally produce the polypeptide or protein. One skilled in the art can readily follow
known methods for isolating polypeptides and proteins in order to obtain one of the isolated
polypeptides or proteins of the present invention. These include, but are not limited to,
immunochromatography, HPLC, size-exclusion chromatography, ion-exchange
chromatography, and immuno-affinity chromatography. See, e.g., Scopes, Protein
Purification: Principles and Practice, Springer-Verlag (1994); Sambrook, et al., in
Molecular Cloning: A Laboratory Manual; Ausubel et al., Current Protocols in Molecular
Biology. Polypeptide fragments that retain biological/immunological activity include
fragments comprising greater than about 100 amino acids, or greater than about 200 amino
acids, and fragments that encode specific protein domains.

The purified polypeptides can be used in in vitro binding assays which are well
known in the art to identify molecules which bind to the polypeptides. These molecules
include, but are not limited to, e.g., small molecules, molecules from combinatorial
libraries, antibodies or other proteins. The molecules identified in the binding assay are then
tested for antagonist or agonist activity in in vivo tissue culture or animal models that are
well known in the art. In brief, the molecules are titrated into a plurality of cell cultures or
animals and then tested for either cell/animal death or prolonged survival of the animal/cells.

In addition, the peptides of the invention or molecules capable of binding to the
peptides may be complexed with toxins, e.g., ricin or cholera, or with other compounds that
are toxic to cells. The toxin-binding molecule complex is then targeted to a tumor or other
cell by the specificity of the binding molecule for SEQ ID NO: 277-552, or 773-992.

The protein of the invention may also be expressed as a product of transgenic
animals, e.g., as a component of the milk of transgenic cows, goats, pigs, or sheep which are
characterized by somatic or germ cells containing a nucleotide sequence encoding the
protein.

The proteins provided herein also include proteins characterized by amino acid
sequences similar to those of purified proteins but into which modifications are naturally
provided or deliberately engineered. For example, modifications, in the peptide or DNA
sequence, can be made by those skilled in the art using known techniques. Modifications of
interest in the protein sequences may include the alteration, substitution, replacement,
insertion or deletion of a selected amino acid residue in the coding sequence. For example,
one or more of the cysteine residues may be deleted or replaced with another amino acid to
alter the conformation of the molecule. Techniques for such alteration, substitution,
replacement, insertion or deletion are well known to those skilled in the art (see, e.g., U.S.
Pat. No. 4,518,584). Preferably, such alteration, substitution, replacement, insertion or
deletion retains the desired activity of the protein. Regions of the protein that are important
for the protein function can be determined by various methods known in the art including the
alanine-scanning method which involves systematic substitution of single or strings of
amino acids with alanine, followed by testing the resulting alanine-containing variant for
biological activity. This type of analysis determines the importance of the substituted amino
acid(s) in biological activity. Regions of the protein that are important for protein function
may be determined by the eMATRIX program.
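
The variant set used in an alanine scan can be enumerated programmatically before any
variant is made or assayed. The following is a minimal sketch only: the function name
alanine_scan and its window parameter are illustrative assumptions, not part of the
disclosure, and the sketch generates variant sequences without predicting their activity.

```python
def alanine_scan(sequence, window=1):
    """Return (position, variant) pairs for an alanine scan.

    Each variant replaces `window` consecutive residues with alanine.
    `position` is the 1-based index of the first substituted residue.
    Windows that are already all alanine are skipped, because the
    resulting variant would be identical to the parent sequence.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    sequence = sequence.upper()
    variants = []
    for start in range(len(sequence) - window + 1):
        segment = sequence[start:start + window]
        if set(segment) == {"A"}:
            continue  # nothing to substitute at this position
        variant = sequence[:start] + "A" * window + sequence[start + window:]
        variants.append((start + 1, variant))
    return variants


if __name__ == "__main__":
    # Hypothetical four-residue peptide, for illustration only.
    for position, variant in alanine_scan("MKAC"):
        print(position, variant)
```

A window of 1 corresponds to substitution of single amino acids, and a larger window to
substitution of strings of amino acids. Each variant would then be expressed and tested
for biological activity as described above.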

Other fragments and derivatives of the sequences of proteins which would be
expected to retain protein activity in whole or in part and are useful for screening or other
immunological methodologies may also be easily made by those skilled in the art given the
disclosures herein. Such modifications are encompassed by the present invention.

The protein may also be produced by operably linking the isolated polynucleotide of
the invention to suitable control sequences in one or more insect expression vectors, and
employing an insect expression system. Materials and methods for baculovirus/insect cell
expression systems are commercially available in kit form from, e.g., Invitrogen, San Diego,
Calif., U.S.A. (the MaxBac™ kit), and such methods are well known in the art, as described
in Summers and Smith, Texas Agricultural Experiment Station Bulletin No. 1555 (1987),
incorporated herein by reference. As used herein, an insect cell capable of expressing a
polynucleotide of the present invention is "transformed."

The protein of the invention may be prepared by culturing transformed host cells
under culture conditions suitable to express the recombinant protein. The resulting
expressed protein may then be purified from such culture (i.e., from culture medium or cell
extracts) using known purification processes, such as gel filtration and ion exchange
chromatography. The purification of the protein may also include an affinity column
containing agents which will bind to the protein; one or more column steps over such affinity
resins as concanavalin A-agarose, heparin-toyopearl™ or Cibacron blue 3GA Sepharose™;
one or more steps involving hydrophobic interaction chromatography using such resins as
phenyl ether, butyl ether, or propyl ether; or immunoaffinity chromatography.

Alternatively, the protein of the invention may also be expressed in a form which will
facilitate purification. For example, it may be expressed as a fusion protein, such as those of
maltose binding protein (MBP), glutathione-S-transferase (GST) or thioredoxin (TRX), or as
a His tag. Kits for expression and purification of such fusion proteins are commercially
available from New England BioLabs (Beverly, Mass.), Pharmacia (Piscataway, N.J.) and
Invitrogen, respectively. The protein can also be tagged with an epitope and subsequently
purified by using a specific antibody directed to such epitope. One such epitope ("FLAG®")
is commercially available from Kodak (New Haven, Conn.).

Finally, one or more reverse-phase high performance liquid chromatography
(RP-HPLC) steps employing hydrophobic RP-HPLC media, e.g., silica gel having pendant
methyl or other aliphatic groups, can be employed to further purify the protein. Some or all
of the foregoing purification steps, in various combinations, can also be employed to provide
a substantially homogeneous isolated recombinant protein. The protein thus purified is
substantially free of other mammalian proteins and is defined in accordance with the present
invention as an "isolated protein."

The polypeptides of the invention include analogs (variants). This embraces
fragments, as well as peptides in which one or more amino acids have been deleted, inserted,
or substituted. Also, analogs of the polypeptides of the invention embrace fusions of the
polypeptides or modifications of the polypeptides of the invention, wherein the polypeptide
or analog is fused to another moiety or moieties, e.g., targeting moiety or another therapeutic
agent. Such analogs may exhibit improved properties such as activity and/or stability.

Examples of moieties which may be fused to the polypeptide or an analog include, for
example, targeting moieties which provide for the delivery of polypeptide to pancreatic cells,
e.g., antibodies to pancreatic cells, antibodies to immune cells such as T-cells, monocytes,
dendritic cells, granulocytes, etc., as well as receptors and ligands expressed on pancreatic or
immune cells. Other moieties which may be fused to the polypeptide include therapeutic
agents which are used for treatment, for example, immunosuppressive drugs such as
cyclosporin, FK506, azathioprine, CD3 antibodies and steroids. Also, polypeptides may be
fused to immune modulators, and other cytokines such as alpha or beta interferon.

4.6.1 DETERMINING POLYPEPTIDE AND POLYNUCLEOTIDE
IDENTITY AND SIMILARITY

Preferred methods of determining identity and/or similarity are designed to give the largest
match between the sequences tested. Methods to determine identity and similarity are codified
in computer programs including, but not limited to, the GCG program package, including GAP
(Devereux, J., et al., Nucleic Acids Research 12(1):387 (1984); Genetics Computer Group,
University of Wisconsin, Madison, WI), BLASTP, BLASTN, BLASTX, FASTA (Altschul,
S.F. et al., J. Molec. Biol. 215:403-410 (1990)), PSI-BLAST (Altschul S.F. et al., Nucleic
Acids Res. vol. 25, pp. 3389-3402, herein incorporated by reference), eMatrix software (Wu
et al., J. Comp. Biol., Vol. 6, pp. 219-235 (1999), herein incorporated by reference), eMotif
software (Nevill-Manning et al., ISMB-97, Vol. 4, pp. 202-209, herein incorporated by
reference), Pfam software (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1), pp. 320-322
(1998), herein incorporated by reference), the Kyte-Doolittle hydrophobicity prediction
algorithm (J. Mol. Biol., 157, pp. 105-31 (1982)), the GeneAtlas software (Molecular
Simulations Inc. (MSI), San Diego, CA) (Sanchez and Sali (1998) Proc. Natl. Acad. Sci., 95,
13597-13602; Kitson DH et al., (2000) "Remote homology detection using structural
modeling - an evaluation" Submitted; Fischer and Eisenberg (1996) Protein Sci. 5, 947-955),
and the Neural Network SignalP V1.1 program (from Center for Biological Sequence
Analysis, The Technical University of Denmark), incorporated herein by reference.
Polypeptide sequences were examined by a proprietary algorithm, SeqLoc, that separates the
proteins into three sets of locales: intracellular, membrane, or secreted. This prediction is
based upon three characteristics of each polypeptide, including percentage of cysteine
residues, Kyte-Doolittle scores for the first 20 amino acids of each protein, and Kyte-Doolittle
scores to calculate the longest hydrophobic stretch of the said protein. Values of
predicted proteins are compared against the values from a set of 592 proteins of known
cellular localization from the Swissprot database (http://www.expasy.ch/sprot). Predictions
are based upon the maximum likelihood estimation.
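
The three characteristics named above can be computed directly from a polypeptide
sequence. The sketch below is not the SeqLoc algorithm, which is proprietary and not
described further here; it only illustrates the three input features. The function name,
the use of a mean score over the first 20 residues, and the definition of a hydrophobic
stretch as a run of residues with a positive Kyte-Doolittle value are all assumptions
made for illustration. The scale values are the published Kyte-Doolittle hydropathy
indices.

```python
# Kyte-Doolittle hydropathy index for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def localization_features(sequence, n_terminal=20, threshold=0.0):
    """Compute three sequence features of the kind named in the text.

    Returns (percent_cysteine, mean_n_terminal_hydropathy,
    longest_hydrophobic_stretch). The threshold that defines a
    "hydrophobic" residue is an assumption of this sketch. Residues
    outside the 20 standard amino acids raise KeyError.
    """
    sequence = sequence.upper()
    if not sequence:
        raise ValueError("sequence must not be empty")

    percent_cysteine = 100.0 * sequence.count("C") / len(sequence)

    head = sequence[:n_terminal]
    mean_n_terminal = sum(KYTE_DOOLITTLE[aa] for aa in head) / len(head)

    # Longest run of consecutive residues scoring above the threshold.
    longest = run = 0
    for aa in sequence:
        run = run + 1 if KYTE_DOOLITTLE[aa] > threshold else 0
        longest = max(longest, run)

    return percent_cysteine, mean_n_terminal, longest


if __name__ == "__main__":
    # Hypothetical short sequence, for illustration only.
    print(localization_features("MCLLLK"))
```

In a full classifier these feature values would be compared with the values observed for
reference proteins of known localization; that comparison step is not sketched here.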

Presence of transmembrane region(s) was detected using the TMpred program
(http://www.ch.embnet.org/software/TMPRED_form.html).

The BLAST programs are publicly available from the National Center for
Biotechnology Information (NCBI) and other sources (BLAST Manual, Altschul, S., et al.
NCBI NLM NIH Bethesda, MD 20894; Altschul, S., et al., J. Mol. Biol. 215:403-410
(1990)).
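
Once a program such as GAP or BLAST has produced a gapped alignment, percent identity
reduces to counting matching columns. The following is a minimal sketch that assumes the
two sequences are already aligned and that gaps are written as "-"; it does not perform
the alignment itself. The choice of denominator (all alignment columns other than
gap-to-gap columns) is an assumption, and the programs cited above may report identity
over a different length.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity between two pre-aligned sequences.

    Both inputs must have the same length. Columns in which both
    sequences carry a gap are ignored. A column with a gap in only
    one sequence counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")

    matches = 0
    compared = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap and y == gap:
            continue  # uninformative column
        compared += 1
        if x == y:
            matches += 1

    if compared == 0:
        raise ValueError("no comparable alignment columns")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Hypothetical alignment, for illustration only.
    print(round(percent_identity("ACGT-A", "ACCTTA"), 1))
```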

4.7 CHIMERIC AND FUSION PROTEINS 

The invention also provides chimeric or fusion proteins. As used herein, a "chimeric
protein" or "fusion protein" comprises a polypeptide of the invention operatively linked to
another polypeptide. Within a fusion protein the polypeptide according to the invention can
correspond to all or a portion of a protein according to the invention. In one embodiment, a
fusion protein comprises at least one biologically active portion of a protein according to the
invention. In another embodiment, a fusion protein comprises at least two biologically
active portions of a protein according to the invention. Within the fusion protein, the term
"operatively linked" is intended to indicate that the polypeptide according to the invention
and the other polypeptide are fused in-frame to each other. The polypeptide can be fused to
the N-terminus or C-terminus, or to the middle.

For example, in one embodiment a fusion protein comprises a polypeptide according
to the invention operably linked to the extracellular domain of a second protein.

In another embodiment, the fusion protein is a GST-fusion protein in which the
polypeptide sequences of the invention are fused to the C-terminus of the GST (i.e.,
glutathione S-transferase) sequences.

In another embodiment, the fusion protein is an immunoglobulin fusion protein in
which the polypeptide sequences according to the invention comprise one or more domains
fused to sequences derived from a member of the immunoglobulin protein family. The
immunoglobulin fusion proteins of the invention can be incorporated into pharmaceutical
compositions and administered to a subject to inhibit an interaction between a ligand and a
protein of the invention on the surface of a cell, to thereby suppress signal transduction in
vivo. The immunoglobulin fusion proteins can be used to affect the bioavailability of a
cognate ligand. Inhibition of the ligand/protein interaction may be useful therapeutically for
both the treatment of proliferative and differentiative disorders, e.g., cancer, as well as
modulating (e.g., promoting or inhibiting) cell survival. Moreover, the immunoglobulin
fusion proteins of the invention can be used as immunogens to produce antibodies in a
subject, to purify ligands, and in screening assays to identify molecules that inhibit the
interaction of a polypeptide of the invention with a ligand.

A chimeric or fusion protein of the invention can be produced by standard
recombinant DNA techniques. For example, DNA fragments coding for the different
polypeptide sequences are ligated together in-frame in accordance with conventional
techniques, e.g., by employing blunt-ended or stagger-ended termini for ligation, restriction
enzyme digestion to provide for appropriate termini, filling-in of cohesive ends as
appropriate, alkaline phosphatase treatment to avoid undesirable joining, and enzymatic
ligation. In another embodiment, the fusion gene can be synthesized by conventional
techniques including automated DNA synthesizers. Alternatively, PCR amplification of
gene fragments can be carried out using anchor primers that give rise to complementary
overhangs between two consecutive gene fragments that can subsequently be annealed and
reamplified to generate a chimeric gene sequence (see, for example, Ausubel et al. (eds.)
Current Protocols in Molecular Biology, John Wiley & Sons, 1992). Moreover,
many expression vectors are commercially available that already encode a fusion moiety
(e.g., a GST polypeptide). A nucleic acid encoding a polypeptide of the invention can be
cloned into such an expression vector such that the fusion moiety is linked in-frame to the
protein of the invention.
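
The requirement that two coding sequences be joined in-frame can be checked at the
sequence-design stage. The following is a minimal sketch under stated assumptions: both
inputs are complete coding sequences written as DNA, the standard stop codons apply, and
the function name fuse_in_frame is hypothetical. It models only the joined sequence and
not the ligation or PCR steps described above.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def fuse_in_frame(upstream_cds, downstream_cds):
    """Join two coding sequences so the downstream partner stays in frame.

    A terminal stop codon on the upstream partner is removed so that
    translation can read through into the downstream partner. Raises
    ValueError if either input is not a whole number of codons or if
    the fused sequence contains a stop codon before its final codon.
    """
    up = upstream_cds.upper()
    down = downstream_cds.upper()
    if len(up) % 3 != 0 or len(down) % 3 != 0:
        raise ValueError("each coding sequence must be a whole number of codons")

    if up[-3:] in STOP_CODONS:
        up = up[:-3]  # drop the upstream stop codon

    fused = up + down
    codons = [fused[i:i + 3] for i in range(0, len(fused), 3)]
    if any(codon in STOP_CODONS for codon in codons[:-1]):
        raise ValueError("fused sequence contains an internal stop codon")
    return fused


if __name__ == "__main__":
    # Hypothetical short sequences, for illustration only.
    print(fuse_in_frame("ATGAAATAA", "GGTTGCTAG"))
```

For a fusion to the C-terminus of GST, the GST coding sequence would be passed as the
upstream partner and the sequence encoding the polypeptide of the invention as the
downstream partner.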

4.8 GENE THERAPY

Mutations in the polynucleotides of the invention may result in loss of normal
function of the encoded protein. The invention thus provides gene therapy to restore normal
activity of the polypeptides of the invention; or to treat disease states involving polypeptides
of the invention. Delivery of a functional gene encoding polypeptides of the invention to
appropriate cells is effected ex vivo, in situ, or in vivo by use of vectors, and more
particularly viral vectors (e.g., adenovirus, adeno-associated virus, or a retrovirus), or ex vivo
by use of physical DNA transfer methods (e.g., liposomes or chemical treatments). See, for
example, Anderson, Nature, supplement to vol. 392, no. 6679, pp. 25-30 (1998). For
additional reviews of gene therapy technology see Friedmann, Science, 244: 1275-1281
(1989); Verma, Scientific American: 68-84 (1990); and Miller, Nature, 357: 455-460 (1992).
Introduction of any one of the nucleotides of the present invention or a gene encoding the
polypeptides of the present invention can also be accomplished with extrachromosomal
substrates (transient expression) or artificial chromosomes (stable expression). Cells may
also be cultured ex vivo in the presence of proteins of the present invention in order to
proliferate or to produce a desired effect on or activity in such cells. Treated cells can then
be introduced in vivo for therapeutic purposes. Alternatively, it is contemplated that in other
human disease states, preventing the expression of or inhibiting the activity of polypeptides
of the invention will be useful in treating the disease states. It is contemplated that antisense
therapy or gene therapy could be applied to negatively regulate the expression of
polypeptides of the invention.

Other methods inhibiting expression of a protein include the introduction of antisense
molecules to the nucleic acids of the present invention, their complements, or their translated
RNA sequences, by methods known in the art. Further, the polypeptides of the present
invention can be inhibited by using targeted deletion methods, or the insertion of a negative
regulatory element such as a silencer, which is tissue specific.
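
An antisense molecule is the reverse complement of the region of the sense strand it is
meant to bind. The following is a minimal sketch of that step only, assuming a DNA
oligonucleotide is wanted and that the target is supplied as a sense-strand DNA or mRNA
sequence. The function name and parameters are hypothetical, and the sketch does not
address target-site selection, oligonucleotide chemistry, or delivery.

```python
_DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def antisense_oligo(sense, start, length):
    """Return the antisense DNA oligo (5'->3') for a target window.

    `sense` is the sense-strand sequence (DNA or RNA letters),
    `start` is the 0-based offset of the window and `length` its size.
    """
    if start < 0 or length < 1:
        raise ValueError("start must be >= 0 and length >= 1")
    # Treat RNA input as DNA so a single complement table suffices.
    dna = sense.upper().replace("U", "T")
    window = dna[start:start + length]
    if len(window) != length:
        raise ValueError("window extends past the end of the sequence")
    # Complement each base, then reverse to read 5'->3'.
    return window.translate(_DNA_COMPLEMENT)[::-1]


if __name__ == "__main__":
    # Hypothetical target around a start codon, for illustration only.
    print(antisense_oligo("AUGGCCAAG", 0, 6))
```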

The present invention still further provides cells genetically engineered in vivo to
express the polynucleotides of the invention, wherein such polynucleotides are in operative
association with a regulatory sequence heterologous to the host cell which drives expression of
the polynucleotides in the cell. These methods can be used to increase or decrease the
expression of the polynucleotides of the present invention.

Knowledge of DNA sequences provided by the invention allows for modification of
cells to permit, increase, or decrease, expression of endogenous polypeptide. Cells can be
modified (e.g., by homologous recombination) to provide increased polypeptide expression by
replacing, in whole or in part, the naturally occurring promoter with all or part of a heterologous
promoter so that the cells express the protein at higher levels. The heterologous promoter is
inserted in such a manner that it is operatively linked to the desired protein encoding sequences.
See, for example, PCT International Publication No. WO 94/12650, PCT International
Publication No. WO 92/20808, and PCT International Publication No. WO 91/09955. It is also
contemplated that, in addition to heterologous promoter DNA, amplifiable marker DNA (e.g.,
ada, dhfr, and the multifunctional CAD gene which encodes carbamyl phosphate synthase,
aspartate transcarbamylase, and dihydroorotase) and/or intron DNA may be inserted along with
the heterologous promoter DNA. If linked to the desired protein coding sequence,
amplification of the marker DNA by standard selection methods results in co-amplification of
the desired protein coding sequences in the cells.

In another embodiment of the present invention, cells and tissues may be engineered to
express an endogenous gene comprising the polynucleotides of the invention under the control
of inducible regulatory elements, in which case the regulatory sequences of the endogenous
gene may be replaced by homologous recombination. As described herein, gene targeting can
be used to replace a gene's existing regulatory region with a regulatory sequence isolated from
a different gene or a novel regulatory sequence synthesized by genetic engineering methods.
Such regulatory sequences may be comprised of promoters, enhancers, scaffold-attachment
regions, negative regulatory elements, transcriptional initiation sites, regulatory protein binding
sites or combinations of said sequences. Alternatively, sequences which affect the structure or
stability of the RNA or protein produced may be replaced, removed, added, or otherwise
modified by targeting. These sequences include polyadenylation signals, mRNA stability
elements, splice sites, leader sequences for enhancing or modifying transport or secretion
properties of the protein, or other sequences which alter or improve the function or stability of
protein or RNA molecules.

The targeting event may be a simple insertion of the regulatory sequence, placing the
gene under the control of the new regulatory sequence, e.g., inserting a new promoter or
enhancer or both upstream of a gene. Alternatively, the targeting event may be a simple
deletion of a regulatory element, such as the deletion of a tissue-specific negative regulatory
element. Alternatively, the targeting event may replace an existing element; for example, a
tissue-specific enhancer can be replaced by an enhancer that has broader or different cell-type
specificity than the naturally occurring elements. Here, the naturally occurring sequences are
deleted and new sequences are added. In all cases, the identification of the targeting event may
be facilitated by the use of one or more selectable marker genes that are contiguous with the
targeting DNA, allowing for the selection of cells in which the exogenous DNA has integrated
into the cell genome. The identification of the targeting event may also be facilitated by the use
of one or more marker genes exhibiting the property of negative selection, such that the
negatively selectable marker is linked to the exogenous DNA, but configured such that the
negatively selectable marker flanks the targeting sequence, and such that a correct homologous
recombination event with sequences in the host cell genome does not result in the stable
integration of the negatively selectable marker. Markers useful for this purpose include the
Herpes Simplex Virus thymidine kinase (TK) gene or the bacterial xanthine-guanine
phosphoribosyl-transferase (gpt) gene.

The gene targeting or gene activation techniques which can be used in accordance with
this aspect of the invention are more particularly described in U.S. Patent No. 5,272,071 to
Chappel; U.S. Patent No. 5,578,461 to Sherwin et al.; International Application No.
PCT/US92/09627 (WO93/09222) by Selden et al.; and International Application No.
PCT/US90/06436 (WO91/06667) by Skoultchi et al., each of which is incorporated by
reference herein in its entirety.

4.9 TRANSGENIC ANIMALS 

In preferred methods to determine biological functions of the polypeptides of the
invention in vivo, one or more genes provided by the invention are either overexpressed or
inactivated in the germ line of animals using homologous recombination [Capecchi, Science
244:1288-1292 (1989)]. Animals in which the gene is overexpressed, under the regulatory
control of exogenous or endogenous promoter elements, are known as transgenic animals.
Animals in which an endogenous gene has been inactivated by homologous recombination
are referred to as "knockout" animals. Knockout animals, preferably non-human mammals,
can be prepared as described in U.S. Patent No. 5,557,032, incorporated herein by reference.
Transgenic animals are useful to determine the roles polypeptides of the invention play in
biological processes, and preferably in disease states. Transgenic animals are useful as model
systems to identify compounds that modulate lipid metabolism. Transgenic animals,
preferably non-human mammals, are produced using methods as described in U.S. Patent No.
5,489,743 and PCT Publication No. WO94/28122, incorporated herein by reference.

Transgenic animals can be prepared wherein all or part of a promoter of the
polynucleotides of the invention is either activated or inactivated to alter the level of
expression of the polypeptides of the invention. Inactivation can be carried out using
homologous recombination methods described above. Activation can be achieved by
supplementing or even replacing the homologous promoter to provide for increased protein
expression. The homologous promoter can be supplemented by insertion of one or more
heterologous enhancer elements known to confer promoter activation in a particular tissue.

The polynucleotides of the present invention also make possible the development,
through, e.g., homologous recombination or knock out strategies, of animals that fail to
express polypeptides of the invention or that express a variant polypeptide. Such animals are
useful as models for studying the in vivo activities of polypeptide as well as for studying
modulators of the polypeptides of the invention.

4.10 USES AND BIOLOGICAL ACTIVITY

The polynucleotides and proteins of the present invention are expected to exhibit one
or more of the uses or biological activities (including those associated with assays cited
herein) identified herein. Uses or activities described for proteins of the present invention
may be provided by administration or use of such proteins or of polynucleotides encoding
such proteins (such as, for example, in gene therapies or vectors suitable for introduction of
DNA). The mechanism underlying the particular condition or pathology will dictate whether
the polypeptides of the invention, the polynucleotides of the invention or modulators
(activators or inhibitors) thereof would be beneficial to the subject in need of treatment.
Thus, "therapeutic compositions of the invention" include compositions comprising isolated
polynucleotides (including recombinant DNA molecules, cloned genes and degenerate
variants thereof) or polypeptides of the invention (including full length protein, mature
protein and truncations or domains thereof), or compounds and other substances that
modulate the overall activity of the target gene products, either at the level of target
gene/protein expression or target protein activity. Such modulators include polypeptides,
analogs (variants), including fragments and fusion proteins, antibodies and other binding
proteins; chemical compounds that directly or indirectly activate or inhibit the polypeptides
of the invention (identified, e.g., via drug screening assays as described herein); antisense
polynucleotides and polynucleotides suitable for triple helix formation; and in particular
antibodies or other binding partners that specifically recognize one or more epitopes of the
polypeptides of the invention.

The polypeptides of the present invention may likewise be involved in cellular 
activation or in one of the other physiological pathways described herein. 

4.10.1 RESEARCH USES AND UTILITIES

The polynucleotides provided by the present invention can be used by the research
community for various purposes. The polynucleotides can be used to express recombinant
protein for analysis, characterization or therapeutic use; as markers for tissues in which the
corresponding protein is preferentially expressed (either constitutively or at a particular stage
of tissue differentiation or development or in disease states); as molecular weight markers on
gels; as chromosome markers or tags (when labeled) to identify chromosomes or to map
related gene positions; to compare with endogenous DNA sequences in patients to identify
potential genetic disorders; as probes to hybridize and thus discover novel, related DNA
sequences; as a source of information to derive PCR primers for genetic fingerprinting; as a
probe to "subtract-out" known sequences in the process of discovering other novel
polynucleotides; for selecting and making oligomers for attachment to a "gene chip" or other
support, including for examination of expression patterns; to raise anti-protein antibodies
using DNA immunization techniques; and as an antigen to raise anti-DNA antibodies or
elicit another immune response. Where the polynucleotide encodes a protein which binds or
potentially binds to another protein (such as, for example, in a receptor-ligand interaction),
the polynucleotide can also be used in interaction trap assays (such as, for example, that
described in Gyuris et al., Cell 75:791-803 (1993)) to identify polynucleotides encoding the
other protein with which binding occurs or to identify inhibitors of the binding interaction.

The polypeptides provided by the present invention can similarly be used in assays to
determine biological activity, including in a panel of multiple proteins for high-throughput
screening; to raise antibodies or to elicit another immune response; as a reagent (including
the labeled reagent) in assays designed to quantitatively determine levels of the protein (or
its receptor) in biological fluids; as markers for tissues in which the corresponding
polypeptide is preferentially expressed (either constitutively or at a particular stage of tissue
differentiation or development or in a disease state); and, of course, to isolate correlative
receptors or ligands. Proteins involved in these binding interactions can also be used to
screen for peptide or small molecule inhibitors or agonists of the binding interaction.

Any or all of these research utilities are capable of being developed into reagent
grade or kit format for commercialization as research products.

Methods for performing the uses listed above are well known to those skilled in the
art. References disclosing such methods include without limitation "Molecular Cloning: A
Laboratory Manual", 2d ed., Cold Spring Harbor Laboratory Press, Sambrook, J., E. F.
Fritsch and T. Maniatis eds., 1989, and "Methods in Enzymology: Guide to Molecular
Cloning Techniques", Academic Press, Berger, S. L. and A. R. Kimmel eds., 1987.

4.10.2 NUTRITIONAL USES 

Polynucleotides and polypeptides of the present invention can also be used as
nutritional sources or supplements. Such uses include without limitation use as a protein or
amino acid supplement, use as a carbon source, use as a nitrogen source and use as a source of
carbohydrate. In such cases the polypeptide or polynucleotide of the invention can be added to
the feed of a particular organism or can be administered as a separate solid or liquid
preparation, such as in the form of powder, pills, solutions, suspensions or capsules. In the case
of microorganisms, the polypeptide or polynucleotide of the invention can be added to the
medium in or on which the microorganism is cultured.

4.10.3 CYTOKINE AND CELL PROLIFERATION/DIFFERENTIATION
ACTIVITY

A polypeptide of the present invention may exhibit activity relating to cytokine, cell
proliferation (either inducing or inhibiting) or cell differentiation (either inducing or
inhibiting) activity or may induce production of other cytokines in certain cell populations.
A polynucleotide of the invention can encode a polypeptide exhibiting such attributes.
Many protein factors discovered to date, including all known cytokines, have exhibited
activity in one or more factor-dependent cell proliferation assays, and hence the assays serve
as a convenient confirmation of cytokine activity. The activity of therapeutic compositions
of the present invention is evidenced by any one of a number of routine factor dependent cell
proliferation assays for cell lines including, without limitation, 32D, DA2, DA1G, T10, B9,
B9/11, BaF3, MC9/G, M+(preB M+), 2E8, RB5, DA1, 123, T1165, HT2, CTLL2, TF-1,
Mo7e, CMK, HUVEC, and Caco. Therapeutic compositions of the invention can be used in
the following:

Assays for T-cell or thymocyte proliferation include without limitation those 
described in: Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D. H. 
Margulies, E. M. Shevach, W. Strober, Pub. Greene Publishing Associates and 
Wiley-Interscience (Chapter 3, In Vitro assays for Mouse Lymphocyte Function 3.1-3.19; 
Chapter 7, Immunologic studies in Humans); Takai et al., J. Immunol. 137:3494-3500, 1986; 
Bertagnolli et al., J. Immunol. 145:1706-1712, 1990; Bertagnolli et al., Cellular Immunology 
133:327-341, 1991; Bertagnolli et al., J. Immunol. 149:3778-3783, 1992; Bowman et al., J. 
Immunol. 152:1756-1761, 1994. 

Assays for cytokine production and/or proliferation of spleen cells, lymph node cells 
or thymocytes include, without limitation, those described in: Polyclonal T cell stimulation, 
Kruisbeek, A. M. and Shevach, E. M. In Current Protocols in Immunology. J. E. e.a. Coligan 
eds. Vol 1 pp. 3.12.1-3.12.14, John Wiley and Sons, Toronto. 1994; and Measurement of 
mouse and human interleukin-γ, Schreiber, R. D. In Current Protocols in Immunology. J. E. 
e.a. Coligan eds. Vol 1 pp. 6.8.1-6.8.8, John Wiley and Sons, Toronto. 1994. 

Assays for proliferation and differentiation of hematopoietic and lymphopoietic cells 
include, without limitation, those described in: Measurement of Human and Murine 
Interleukin 2 and Interleukin 4, Bottomly, K., Davis, L. S. and Lipsky, P. E. In Current 
Protocols in Immunology. J. E. e.a. Coligan eds. Vol 1 pp. 6.3.1-6.3.12, John Wiley and 
Sons, Toronto. 1991; deVries et al., J. Exp. Med. 173:1205-1211, 1991; Moreau et al., 
Nature 336:690-692, 1988; Greenberger et al., Proc. Natl. Acad. Sci. U.S.A. 80:2931-2938, 
1983; Measurement of mouse and human interleukin 6, Nordan, R. In Current Protocols in 
Immunology. J. E. Coligan eds. Vol 1 pp. 6.6.1-6.6.5, John Wiley and Sons, Toronto. 1991; 
Smith et al., Proc. Natl. Acad. Sci. U.S.A. 83:1857-1861, 1986; Measurement of human 
Interleukin 11, Bennett, F., Giannotti, J., Clark, S. C. and Turner, K. J. In Current Protocols 
in Immunology. J. E. Coligan eds. Vol 1 pp. 6.15.1 John Wiley and Sons, Toronto. 1991; 
Measurement of mouse and human Interleukin 9, Ciarletta, A., Giannotti, J., Clark, S. C. 
and Turner, K. J. In Current Protocols in Immunology. J. E. Coligan eds. Vol 1 pp. 6.13.1, 
John Wiley and Sons, Toronto. 1991. 

Assays for T-cell clone responses to antigens (which will identify, among others, 
proteins that affect APC-T cell interactions as well as direct T-cell effects by measuring 
proliferation and cytokine production) include, without limitation, those described in: 
Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D. H. Margulies, 
E. M. Shevach, W. Strober, Pub. Greene Publishing Associates and Wiley-Interscience 
(Chapter 3, In Vitro assays for Mouse Lymphocyte Function; Chapter 6, Cytokines and their 
cellular receptors; Chapter 7, Immunologic studies in Humans); Weinberger et al., Proc. 
Natl. Acad. Sci. USA 77:6091-6095, 1980; Weinberger et al., Eur. J. Immun. 11:405-411, 
1981; Takai et al., J. Immunol. 137:3494-3500, 1986; Takai et al., J. Immunol. 140:508-512, 
1988. 

4.10.4 STEM CELL GROWTH FACTOR ACTIVITY 

A polypeptide of the present invention may exhibit stem cell growth factor activity 
and be involved in the proliferation, differentiation and survival of pluripotent and totipotent 
stem cells including primordial germ cells, embryonic stem cells, hematopoietic stem cells 
and/or germ line stem cells. Administration of the polypeptide of the invention to stem cells 
in vivo or ex vivo is expected to maintain and expand cell populations in a totipotential or 
pluripotential state which would be useful for re-engineering damaged or diseased tissues, 
transplantation, manufacture of bio-pharmaceuticals and the development of bio-sensors. 
The ability to produce large quantities of human cells has important working applications for 
the production of human proteins which currently must be obtained from non-human sources 
or donors; implantation of cells to treat diseases such as Parkinson's, Alzheimer's and other 
neurodegenerative diseases; tissues for grafting such as bone marrow, skin, cartilage, 
tendons, bone, muscle (including cardiac muscle), blood vessels, cornea, neural cells, 
gastrointestinal cells and others; and organs for transplantation such as kidney, liver, 
pancreas (including islet cells), heart and lung. 

It is contemplated that multiple different exogenous growth factors and/or cytokines 
may be administered in combination with the polypeptide of the invention to achieve the 
desired effect, including any of the growth factors listed herein, other stem cell maintenance 
factors, and specifically including stem cell factor (SCF), leukemia inhibitory factor (LIF), 
Flt-3 ligand (Flt-3L), any of the interleukins, recombinant soluble IL-6 receptor fused to 
IL-6, macrophage inflammatory protein 1-alpha (MIP-1-alpha), G-CSF, GM-CSF, 
thrombopoietin (TPO), platelet factor 4 (PF-4), platelet-derived growth factor (PDGF), 
neural growth factors and basic fibroblast growth factor (bFGF). 

Since totipotent stem cells can give rise to virtually any mature cell type, expansion 
of these cells in culture will facilitate the production of large quantities of mature cells. 
Techniques for culturing stem cells are known in the art and administration of polypeptides 
of the invention, optionally with other growth factors and/or cytokines, is expected to 
enhance the survival and proliferation of the stem cell populations. This can be 
accomplished by direct administration of the polypeptide of the invention to the culture 
medium. Alternatively, stromal cells transfected with a polynucleotide that encodes the 
polypeptide of the invention can be used as a feeder layer for the stem cell populations in 
culture or in vivo. Stromal support cells for feeder layers may include embryonic bone 
marrow fibroblasts, bone marrow stromal cells, fetal liver cells, or cultured embryonic 
fibroblasts (see U.S. Patent No. 5,690,926). 

Stem cells themselves can be transfected with a polynucleotide of the invention to 
induce autocrine expression of the polypeptide of the invention. This will allow for 
generation of undifferentiated totipotential/pluripotential stem cell lines that are useful as is 
or that can then be differentiated into the desired mature cell types. These stable cell lines 
can also serve as a source of undifferentiated totipotential/pluripotential mRNA to create 
cDNA libraries and templates for polymerase chain reaction experiments. These studies 
would allow for the isolation and identification of differentially expressed genes in stem cell 
populations that regulate stem cell proliferation and/or maintenance. 

Expansion and maintenance of totipotent stem cell populations will be useful in the 
treatment of many pathological conditions. For example, polypeptides of the present 
invention may be used to manipulate stem cells in culture to give rise to neuroepithelial cells 
that can be used to augment or replace cells damaged by illness, autoimmune disease, 
accidental damage or genetic disorders. The polypeptide of the invention may be useful for 
inducing the proliferation of neural cells and for the regeneration of nerve and brain tissue, 
i.e., for the treatment of central and peripheral nervous system diseases and neuropathies, as 
well as mechanical and traumatic disorders which involve degeneration, death or trauma to 
neural cells or nerve tissue. In addition, the expanded stem cell populations can also be 
genetically altered for gene therapy purposes and to decrease host rejection of replacement 
tissues after grafting or implantation. 

Expression of the polypeptide of the invention and its effect on stem cells can also be 
manipulated to achieve controlled differentiation of the stem cells into more differentiated 
cell types. A broadly applicable method of obtaining pure populations of a specific 
differentiated cell type from undifferentiated stem cell populations involves the use of a 
cell-type specific promoter driving a selectable marker. The selectable marker allows only cells 
of the desired type to survive. For example, stem cells can be induced to differentiate into 
cardiomyocytes (Wobus et al., Differentiation, 48: 173-182, (1991); Klug et al., J. Clin. 
Invest., 98(1): 216-224, (1998)) or skeletal muscle cells (Browder, L. W. In: Principles of 
Tissue Engineering eds. Lanza et al., Academic Press (1997)). Alternatively, directed 
differentiation of stem cells can be accomplished by culturing the stem cells in the presence 
of a differentiation factor such as retinoic acid and an antagonist of the polypeptide of the 
invention which would inhibit the effects of endogenous stem cell factor activity and allow 
differentiation to proceed. 

In vitro cultures of stem cells can be used to determine if the polypeptide of the 
invention exhibits stem cell growth factor activity. Stem cells are isolated from any one of 
various cell sources (including hematopoietic stem cells and embryonic stem cells) and 
cultured on a feeder layer, as described by Thompson et al., Proc. Natl. Acad. Sci. U.S.A., 
92: 7844-7848 (1995), in the presence of the polypeptide of the invention alone or in 
combination with other growth factors or cytokines. The ability of the polypeptide of the 
invention to induce stem cell proliferation is determined by colony formation on semi-solid 
support, e.g., as described by Bernstein et al., Blood, 77: 2316-2321 (1991). 

4.10.5 HEMATOPOIESIS REGULATING ACTIVITY 

A polypeptide of the present invention may be involved in regulation of 
hematopoiesis and, consequently, in the treatment of myeloid or lymphoid cell disorders. 
Even marginal biological activity in support of colony forming cells or of factor-dependent 
cell lines indicates involvement in regulating hematopoiesis, e.g., in supporting the growth 
and proliferation of erythroid progenitor cells alone or in combination with other cytokines, 
thereby indicating utility, for example, in treating various anemias or for use in conjunction 
with irradiation/chemotherapy to stimulate the production of erythroid precursors and/or 
erythroid cells; in supporting the growth and proliferation of myeloid cells such as 
granulocytes and monocytes/macrophages (i.e., traditional CSF activity) useful, for example, 
in conjunction with chemotherapy to prevent or treat consequent myelo-suppression; in 
supporting the growth and proliferation of megakaryocytes and consequently of platelets 
thereby allowing prevention or treatment of various platelet disorders such as 
thrombocytopenia, and generally for use in place of or complementary to platelet 
transfusions; and/or in supporting the growth and proliferation of hematopoietic stem cells 
which are capable of maturing to any and all of the above-mentioned hematopoietic cells and 
therefore find therapeutic utility in various stem cell disorders (such as those usually treated 
with transplantation, including, without limitation, aplastic anemia and paroxysmal nocturnal 
hemoglobinuria), as well as in repopulating the stem cell compartment post 
irradiation/chemotherapy, either in vivo or ex vivo (i.e., in conjunction with bone marrow 
transplantation or with peripheral progenitor cell transplantation (homologous or 
heterologous)) as normal cells or genetically manipulated for gene therapy. 

Therapeutic compositions of the invention can be used in the following: 
Suitable assays for proliferation and differentiation of various hematopoietic lines are 
cited above. 

Assays for embryonic stem cell differentiation (which will identify, among others, 
proteins that influence embryonic differentiation hematopoiesis) include, without limitation, 
those described in: Johansson et al. Cellular Biology 15:141-151, 1995; Keller et al., 
Molecular and Cellular Biology 13:473-486, 1993; McClanahan et al., Blood 81:2903-2915, 
1993. 

Assays for stem cell survival and differentiation (which will identify, among others, 
proteins that regulate lympho-hematopoiesis) include, without limitation, those described in: 
Methylcellulose colony forming assays, Freshney, M. G. In Culture of Hematopoietic Cells. 
R. I. Freshney, et al. eds. Vol pp. 265-268, Wiley-Liss, Inc., New York, N.Y. 1994; 
Hirayama et al., Proc. Natl. Acad. Sci. USA 89:5907-5911, 1992; Primitive hematopoietic 
colony forming cells with high proliferative potential, McNiece, I. K. and Briddell, R. A. In 
Culture of Hematopoietic Cells. R. I. Freshney, et al. eds. Vol pp. 23-39, Wiley-Liss, Inc., 
New York, N.Y. 1994; Neben et al., Experimental Hematology 22:353-359, 1994; 
Cobblestone area forming cell assay, Ploemacher, R. E. In Culture of Hematopoietic Cells. 
R. I. Freshney, et al. eds. Vol pp. 1-21, Wiley-Liss, Inc., New York, N.Y. 1994; Long term 
bone marrow cultures in the presence of stromal cells, Spooncer, E., Dexter, M. and Allen, 
T. In Culture of Hematopoietic Cells. R. I. Freshney, et al. eds. Vol pp. 163-179, Wiley-Liss, 
Inc., New York, N.Y. 1994; Long term culture initiating cell assay, Sutherland, H. J. In 
Culture of Hematopoietic Cells. R. I. Freshney, et al. eds. Vol pp. 139-162, Wiley-Liss, Inc., 
New York, N.Y. 1994. 

4.10.6 TISSUE GROWTH ACTIVITY 

A polypeptide of the present invention also may be involved in bone, cartilage, 
tendon, ligament and/or nerve tissue growth or regeneration, as well as in wound healing and 
tissue repair and replacement, and in healing of burns, incisions and ulcers. 

A polypeptide of the present invention which induces cartilage and/or bone growth in 
circumstances where bone is not normally formed has application in the healing of bone 
fractures and cartilage damage or defects in humans and other animals. Compositions of a 
polypeptide, antibody, binding partner, or other modulator of the invention may have 
prophylactic use in closed as well as open fracture reduction and also in the improved 
fixation of artificial joints. De novo bone formation induced by an osteogenic agent 
contributes to the repair of congenital, trauma induced, or oncologic resection induced 
craniofacial defects, and also is useful in cosmetic plastic surgery. 

A polypeptide of this invention may also be involved in attracting bone-forming 
cells, stimulating growth of bone-forming cells, or inducing differentiation of progenitors of 
bone-forming cells. Treatment of osteoporosis, osteoarthritis, bone degenerative disorders, or 
periodontal disease, such as through stimulation of bone and/or cartilage repair or by 
blocking inflammation or processes of tissue destruction (collagenase activity, osteoclast 
activity, etc.) mediated by inflammatory processes may also be possible using the 
composition of the invention. 

Another category of tissue regeneration activity that may involve the polypeptide of 
the present invention is tendon/ligament formation. Induction of tendon/ligament-like tissue 
or other tissue formation in circumstances where such tissue is not normally formed has 
application in the healing of tendon or ligament tears, deformities and other tendon or 
ligament defects in humans and other animals. Such a preparation employing a 
tendon/ligament-like tissue inducing protein may have prophylactic use in preventing 
damage to tendon or ligament tissue, as well as use in the improved fixation of tendon or 
ligament to bone or other tissues, and in repairing defects to tendon or ligament tissue. De 
novo tendon/ligament-like tissue formation induced by a composition of the present 
invention contributes to the repair of congenital, trauma induced, or other tendon or ligament 
defects of other origin, and is also useful in cosmetic plastic surgery for attachment or repair 
of tendons or ligaments. The compositions of the present invention may provide an 
environment to attract tendon- or ligament-forming cells, stimulate growth of tendon- or 
ligament-forming cells, induce differentiation of progenitors of tendon- or ligament-forming 
cells, or induce growth of tendon/ligament cells or progenitors ex vivo for return in vivo to 
effect tissue repair. The compositions of the invention may also be useful in the treatment of 
tendinitis, carpal tunnel syndrome and other tendon or ligament defects. The compositions 
may also include an appropriate matrix and/or sequestering agent as a carrier as is well 
known in the art. 

The compositions of the present invention may also be useful for proliferation of 
neural cells and for regeneration of nerve and brain tissue, i.e., for the treatment of central 
and peripheral nervous system diseases and neuropathies, as well as mechanical and 
traumatic disorders, which involve degeneration, death or trauma to neural cells or nerve 
tissue. More specifically, a composition may be used in the treatment of diseases of the 
peripheral nervous system, such as peripheral nerve injuries, peripheral neuropathy and 
localized neuropathies, and central nervous system diseases, such as Alzheimer's, 
Parkinson's disease, Huntington's disease, amyotrophic lateral sclerosis, and Shy-Drager 
syndrome. Further conditions which may be treated in accordance with the present invention 
include mechanical and traumatic disorders, such as spinal cord disorders, head trauma and 
cerebrovascular diseases such as stroke. Peripheral neuropathies resulting from 
chemotherapy or other medical therapies may also be treatable using a composition of the 
invention. 

Compositions of the invention may also be useful to promote better or faster closure 
of non-healing wounds, including without limitation pressure ulcers, ulcers associated with 
vascular insufficiency, surgical and traumatic wounds, and the like. 

Compositions of the present invention may also be involved in the generation or 
regeneration of other tissues, such as organs (including, for example, pancreas, liver, 
intestine, kidney, skin, endothelium), muscle (smooth, skeletal or cardiac) and vascular 
(including vascular endothelium) tissue, or for promoting the growth of cells comprising 
such tissues. Part of the desired effects may be by inhibition or modulation of fibrotic 
scarring to allow normal tissue to regenerate. A polypeptide of the present invention may 
also exhibit angiogenic activity. 

A composition of the present invention may also be useful for gut protection or 
regeneration and treatment of lung or liver fibrosis, reperfusion injury in various tissues, and 
conditions resulting from systemic cytokine damage. 

A composition of the present invention may also be useful for promoting or 
inhibiting differentiation of tissues described above from precursor tissues or cells; or for 
inhibiting the growth of tissues described above. 

Therapeutic compositions of the invention can be used in the following: 
Assays for tissue generation activity include, without limitation, those described in: 
International Patent Publication No. WO95/16035 (bone, cartilage, tendon); International 
Patent Publication No. WO95/05846 (nerve, neuronal); International Patent Publication No. 
WO91/07491 (skin, endothelium). 

Assays for wound healing activity include, without limitation, those described in: 
Winter, Epidermal Wound Healing, pp. 71-112 (Maibach, H. I. and Rovee, D. T., eds.), 
Year Book Medical Publishers, Inc., Chicago, as modified by Eaglstein and Mertz, J. Invest. 
Dermatol. 71:382-84 (1978). 

4.10.7 IMMUNE STIMULATING OR SUPPRESSING ACTIVITY 

A polypeptide of the present invention may also exhibit immune stimulating or 
immune suppressing activity, including without limitation the activities for which assays are 
described herein. A polynucleotide of the invention can encode a polypeptide exhibiting 
such activities. A protein may be useful in the treatment of various immune deficiencies and 
disorders (including severe combined immunodeficiency (SCID)), e.g., in regulating (up or 
down) growth and proliferation of T and/or B lymphocytes, as well as affecting the cytolytic 
activity of NK cells and other cell populations. These immune deficiencies may be genetic or 
be caused by viral (e.g., HIV) as well as bacterial or fungal infections, or may result from 
autoimmune disorders. More specifically, infectious diseases caused by viral, bacterial, 
fungal or other infection may be treatable using a protein of the present invention, including 
infections by HIV, hepatitis viruses, herpes viruses, mycobacteria, Leishmania spp., malaria 
spp. and various fungal infections such as candidiasis. Of course, in this regard, proteins of 
the present invention may also be useful where a boost to the immune system generally may 
be desirable, i.e., in the treatment of cancer. 

Autoimmune disorders which may be treated using a protein of the present invention 
include, for example, connective tissue disease, multiple sclerosis, systemic lupus 
erythematosus, rheumatoid arthritis, autoimmune pulmonary inflammation, Guillain-Barre 
syndrome, autoimmune thyroiditis, insulin dependent diabetes mellitus, myasthenia gravis, 
graft-versus-host disease and autoimmune inflammatory eye disease. Such a protein (or 
antagonists thereof, including antibodies) of the present invention may also be useful in 
the treatment of allergic reactions and conditions (e.g., anaphylaxis, serum sickness, drug 
reactions, food allergies, insect venom allergies, mastocytosis, allergic rhinitis, 
hypersensitivity pneumonitis, urticaria, angioedema, eczema, atopic dermatitis, allergic 
contact dermatitis, erythema multiforme, Stevens-Johnson syndrome, allergic conjunctivitis, 
atopic keratoconjunctivitis, vernal keratoconjunctivitis, giant papillary conjunctivitis and 
contact allergies), such as asthma (particularly allergic asthma) or other respiratory 
problems. Other conditions, in which immune suppression is desired (including, for 
example, organ transplantation), may also be treatable using a protein (or antagonists 
thereof) of the present invention. The therapeutic effects of the polypeptides or antagonists 
thereof on allergic reactions can be evaluated by in vivo animal models such as the 
cumulative contact enhancement test (Lastbom et al., Toxicology 125: 59-66, 1998), skin 
prick test (Hoffmann et al., Allergy 54: 446-54, 1999), guinea pig skin sensitization test 
(Vohr et al., Arch. Toxicol. 73: 501-9), and murine local lymph node assay (Kimber et al., 
J. Toxicol. Environ. Health 53: 563-79). 

Using the proteins of the invention it may also be possible to modulate immune 
responses in a number of ways. Down regulation may be in the form of inhibiting or 
blocking an immune response already in progress or may involve preventing the induction of 
an immune response. The functions of activated T cells may be inhibited by suppressing T 
cell responses or by inducing specific tolerance in T cells, or both. Immunosuppression of T 
cell responses is generally an active, non-antigen-specific, process which requires continuous 
exposure of the T cells to the suppressive agent. Tolerance, which involves inducing 
non-responsiveness or anergy in T cells, is distinguishable from immunosuppression in that 
it is generally antigen-specific and persists after exposure to the tolerizing agent has ceased. 
Operationally, tolerance can be demonstrated by the lack of a T cell response upon 
reexposure to specific antigen in the absence of the tolerizing agent. 

Down regulating or preventing one or more antigen functions (including without 
limitation B lymphocyte antigen functions (such as, for example, B7)), e.g., preventing high 
level lymphokine synthesis by activated T cells, will be useful in situations of tissue, skin 
and organ transplantation and in graft-versus-host disease (GVHD). For example, blockage 
of T cell function should result in reduced tissue destruction in tissue transplantation. 
Typically, in tissue transplants, rejection of the transplant is initiated through its recognition 
as foreign by T cells, followed by an immune reaction that destroys the transplant. The 
administration of a therapeutic composition of the invention may prevent cytokine synthesis 
by immune cells, such as T cells, and thus acts as an immunosuppressant. Moreover, a lack 
of costimulation may also be sufficient to anergize the T cells, thereby inducing tolerance in 
a subject. Induction of long-term tolerance by B lymphocyte antigen-blocking reagents may 
avoid the necessity of repeated administration of these blocking reagents. To achieve 
sufficient immunosuppression or tolerance in a subject, it may also be necessary to block the 
function of a combination of B lymphocyte antigens. 

The efficacy of particular therapeutic compositions in preventing organ transplant 
rejection or GVHD can be assessed using animal models that are predictive of efficacy in 
humans. Examples of appropriate systems which can be used include allogeneic cardiac 
grafts in rats and xenogeneic pancreatic islet cell grafts in mice, both of which have been 
used to examine the immunosuppressive effects of CTLA4Ig fusion proteins in vivo as 
described in Lenschow et al., Science 257:789-792 (1992) and Turka et al., Proc. Natl. Acad. 
Sci. USA, 89:11102-11105 (1992). In addition, murine models of GVHD (see Paul ed., 
Fundamental Immunology, Raven Press, New York, 1989, pp. 846-847) can be used to 
determine the effect of therapeutic compositions of the invention on the development of that 
disease. 

Blocking antigen function may also be therapeutically useful for treating 
autoimmune diseases. Many autoimmune disorders are the result of inappropriate activation 
of T cells that are reactive against self-tissue and which promote the production of cytokines 
and autoantibodies involved in the pathology of the diseases. Preventing the activation of 
autoreactive T cells may reduce or eliminate disease symptoms. Administration of reagents 
which block stimulation of T cells can be used to inhibit T cell activation and prevent 
production of autoantibodies or T cell-derived cytokines which may be involved in the 
disease process. Additionally, blocking reagents may induce antigen-specific tolerance of 
autoreactive T cells which could lead to long-term relief from the disease. The efficacy of 
blocking reagents in preventing or alleviating autoimmune disorders can be determined 
using a number of well-characterized animal models of human autoimmune diseases. 
Examples include murine experimental autoimmune encephalitis, systemic lupus 
erythematosus in MRL/lpr/lpr mice or NZB hybrid mice, murine autoimmune collagen 
arthritis, diabetes mellitus in NOD mice and BB rats, and murine experimental myasthenia 
gravis (see Paul ed., Fundamental Immunology, Raven Press, New York, 1989, pp. 
840-856). 

Upregulation of an antigen function (e.g., a B lymphocyte antigen function), as a 
means of upregulating immune responses, may also be useful in therapy. Upregulation of 
immune responses may be in the form of enhancing an existing immune response or eliciting 
an initial immune response. For example, enhancing an immune response may be useful in 
cases of viral infection, including systemic viral diseases such as influenza, the common 
cold, and encephalitis. 

Alternatively, anti-viral immune responses may be enhanced in an infected patient by 
removing T cells from the patient, costimulating the T cells in vitro with viral antigen-pulsed 
APCs either expressing a peptide of the present invention or together with a stimulatory 
form of a soluble peptide of the present invention and reintroducing the in vitro activated T 
cells into the patient. Another method of enhancing anti-viral immune responses would be to 
isolate infected cells from a patient, transfect them with a nucleic acid encoding a protein of 
the present invention as described herein such that the cells express all or a portion of the 
protein on their surface, and reintroduce the transfected cells into the patient. The infected 
cells would now be capable of delivering a costimulatory signal to, and thereby activating, T 
cells in vivo. 

A polypeptide of the present invention may provide the necessary stimulation signal 
to T cells to induce a T cell mediated immune response against the transfected tumor cells. 
In addition, tumor cells which lack MHC class I or MHC class II molecules, or which fail to 
reexpress sufficient amounts of MHC class I or MHC class II molecules, can be transfected 
with nucleic acid encoding all or a portion (e.g., a cytoplasmic-domain truncated portion) 
of an MHC class I alpha chain protein and β2 microglobulin protein or an MHC class II 
alpha chain protein and an MHC class II beta chain protein to thereby express MHC class I 
or MHC class II proteins on the cell surface. Expression of the appropriate class I or class II 
MHC in conjunction with a peptide having the activity of a B lymphocyte antigen (e.g., 
B7-1, B7-2, B7-3) induces a T cell mediated immune response against the transfected tumor 
cell. Optionally, a gene encoding an antisense construct which blocks expression of an MHC 
class II associated protein, such as the invariant chain, can also be cotransfected with a DNA 
encoding a peptide having the activity of a B lymphocyte antigen to promote presentation of 
tumor associated antigens and induce tumor specific immunity. Thus, the induction of a T 
cell mediated immune response in a human subject may be sufficient to overcome 
tumor-specific tolerance in the subject. 

The activity of a protein of the invention may, among other means, be measured by 
the following methods: 

Suitable assays for thymocyte or splenocyte cytotoxicity include, without limitation, 
those described in: Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, 
D. H. Margulies, E. M. Shevach, W. Strober, Pub. Greene Publishing Associates and 
Wiley-Interscience (Chapter 3, In Vitro assays for Mouse Lymphocyte Function 3.1-3.19; 
Chapter 7, Immunologic studies in Humans); Herrmann et al., Proc. Natl. Acad. Sci. USA 
78:2488-2492, 1981; Herrmann et al., J. Immunol. 128:1968-1974, 1982; Handa et al., J. 
Immunol. 135:1564-1572, 1985; Takai et al., J. Immunol. 137:3494-3500, 1986; Takai et al., 
J. Immunol. 140:508-512, 1988; Bowman et al., J. Virology 61:1992-1998; Bertagnolli et 
al., Cellular Immunology 133:327-341, 1991; Brown et al., J. Immunol. 153:3079-3092, 
1994. 

Assays for T-cell-dependent immunoglobulin responses and isotype switching 
(which will identify, among others, proteins that modulate T-cell dependent antibody 
responses and that affect Th1/Th2 profiles) include, without limitation, those described in: 
Maliszewski, J. Immunol. 144:3028-3033, 1990; and Assays for B cell function: In vitro 
antibody production, Mond, J. J. and Brunswick, M. In Current Protocols in Immunology. J. 
E. e.a. Coligan eds. Vol 1 pp. 3.8.1-3.8.16, John Wiley and Sons, Toronto. 1994. 

Mixed lymphocyte reaction (MLR) assays (which will identify, among others, 
proteins that generate predominantly Th1 and CTL responses) include, without limitation, 
those described in: Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, 
D. H. Margulies, E. M. Shevach, W. Strober, Pub. Greene Publishing Associates and 
Wiley-Interscience (Chapter 3, In Vitro assays for Mouse Lymphocyte Function 3.1-3.19; 
Chapter 7, Immunologic studies in Humans); Takai et al., J. Immunol. 137:3494-3500, 1986; 
Takai et al., J. Immunol. 140:508-512, 1988; Bertagnolli et al., J. Immunol. 149:3778-3783, 
1992. 

Dendritic cell-dependent assays (which will identify, among others, proteins 
expressed by dendritic cells that activate naive T-cells) include, without limitation, those 
described in: Guery et al., J. Immunol. 134:536-544, 1995; Inaba et al., Journal of 
Experimental Medicine 173:549-559, 1991; Macatonia et al., Journal of Immunology 
154:5071-5079, 1995; Porgador et al., Journal of Experimental Medicine 182:255-260, 
1995; Nair et al., Journal of Virology 67:4062-4069, 1993; Huang et al., Science 
264:961-965, 1994; Macatonia et al., Journal of Experimental Medicine 169:1255-1264, 
1989; Bhardwaj et al., Journal of Clinical Investigation 94:797-807, 1994; and Inaba et al., 
Journal of Experimental Medicine 172:631-640, 1990. 

Assays for lymphocyte survival/apoptosis (which will identify, among others, 
proteins that prevent apoptosis after superantigen induction and proteins that regulate 
lymphocyte homeostasis) include, without limitation, those described in: Darzynkiewicz et 
al., Cytometry 13:795-808, 1992; Gorczyca et al., Leukemia 7:659-670, 1993; Gorczyca et 
al., Cancer Research 53:1945-1951, 1993; Itoh et al., Cell 66:233-243, 1991; Zacharchuk, 
Journal of Immunology 145:4037-4045, 1990; Zamai et al., Cytometry 14:891-897, 1993; 
Gorczyca et al., International Journal of Oncology 1:639-648, 1992. 

Assays for proteins that influence early steps of T-cell commitment and development 
include, without limitation, those described in: Antica et al., Blood 84:111-117, 1994; Fine 
et al., Cellular Immunology 155:111-122, 1994; Galy et al., Blood 85:2770-2778, 1995; 
Toki et al., Proc. Natl. Acad. Sci. USA 88:7548-7551, 1991. 



4.10.8 ACTIVIN/INHIBIN ACTIVITY 

A polypeptide of the present invention may also exhibit activin- or inhibin-related 
activities. A polynucleotide of the invention may encode a polypeptide exhibiting such 
characteristics. Inhibins are characterized by their ability to inhibit the release of follicle 
stimulating hormone (FSH), while activins are characterized by their ability to stimulate 
the release of FSH. Thus, a polypeptide of the present 
invention, alone or in heterodimers with a member of the inhibin family, may be useful as a 
contraceptive based on the ability of inhibins to decrease fertility in female mammals and 
decrease spermatogenesis in male mammals. Administration of sufficient amounts of other 
inhibins can induce infertility in these mammals. Alternatively, the polypeptide of the 
invention, as a homodimer or as a heterodimer with other protein subunits of the inhibin 
group, may be useful as a fertility inducing therapeutic, based upon the ability of activin 
molecules to stimulate FSH release from cells of the anterior pituitary. See, for example, 
U.S. Pat. No. 4,798,885. A polypeptide of the invention may also be useful for advancement 
of the onset of fertility in sexually immature mammals, so as to increase the lifetime 
reproductive performance of domestic animals such as, but not limited to, cows, sheep and 
pigs. 

The activity of a polypeptide of the invention may, among other means, be measured 
by the following methods. 

Assays for activin/inhibin activity include, without limitation, those described in: 
Vale et al., Endocrinology 91:562-572, 1972; Ling et al., Nature 321:779-782, 1986; Vale et 
al., Nature 321:776-779, 1986; Mason et al., Nature 318:659-663, 1985; Forage et al., Proc. 
Natl. Acad. Sci. USA 83:3091-3095, 1986. 

4.10.9 CHEMOTACTIC/CHEMOKINETIC ACTIVITY 

A polypeptide of the present invention may be involved in chemotactic or 
chemokinetic activity for mammalian cells, including, for example, monocytes, fibroblasts, 
neutrophils, T-cells, mast cells, eosinophils, epithelial and/or endothelial cells. A 
polynucleotide of the invention can encode a polypeptide exhibiting such attributes. 
Chemotactic and chemokinetic receptor activation can be used to mobilize or attract a 
desired cell population to a desired site of action. Chemotactic or chemokinetic compositions 
(e.g. proteins, antibodies, binding partners, or modulators of the invention) provide particular 
advantages in treatment of wounds and other trauma to tissues, as well as in treatment of 
localized infections. For example, attraction of lymphocytes, monocytes or neutrophils to 
tumors or sites of infection may result in improved immune responses against the tumor or 
infecting agent. 

A protein or peptide has chemotactic activity for a particular cell population if it can 
stimulate, directly or indirectly, the directed orientation or movement of such cell 
population. Preferably, the protein or peptide has the ability to directly stimulate directed 
movement of cells. Whether a particular protein has chemotactic activity for a population of 
cells can be readily determined by employing such protein or peptide in any known assay for 
cell chemotaxis. 

Therapeutic compositions of the invention can be used in the following: 

Assays for chemotactic activity (which will identify proteins that induce or prevent 
chemotaxis) consist of assays that measure the ability of a protein to induce the migration of 
cells across a membrane as well as the ability of a protein to induce the adhesion of one cell 
population to another cell population. Suitable assays for movement and adhesion include, 
without limitation, those described in: Current Protocols in Immunology, Ed by J. E. 
Coligan, A. M. Kruisbeek, D. H. Margulies, E. M. Shevach, W. Strober, Pub. Greene 
Publishing Associates and Wiley-Interscience (Chapter 6.12, Measurement of alpha and beta 
Chemokines 6.12.1-6.12.28); Taub et al., J. Clin. Invest. 95:1370-1376, 1995; Lind et al., 
APMIS 103:140-146, 1995; Muller et al., Eur. J. Immunol. 25:1744-1748; Gruber et al., J. of 
Immunol. 152:5860-5867, 1994; Johnston et al., J. of Immunol. 153:1762-1768, 1994. 

4.10.10 HEMOSTATIC AND THROMBOLYTIC ACTIVITY 

A polypeptide of the invention may also be involved in hemostasis or thrombolysis or 
thrombosis. A polynucleotide of the invention can encode a polypeptide exhibiting such 
attributes. Compositions may be useful in treatment of various coagulation disorders 
(including hereditary disorders, such as hemophilias) or to enhance coagulation and other 
hemostatic events in treating wounds resulting from trauma, surgery or other causes. A 
composition of the invention may also be useful for dissolving or inhibiting formation of 
thromboses and for treatment and prevention of conditions resulting therefrom (such as, for 
example, infarction of cardiac and central nervous system vessels (e.g., stroke)). 

Therapeutic compositions of the invention can be used in the following: 

Assays for hemostatic and thrombolytic activity include, without limitation, those 
described in: Linet et al., J. Clin. Pharmacol. 26:131-140, 1986; Burdick et al., Thrombosis 
Res. 45:413-419, 1987; Humphrey et al., Fibrinolysis 5:71-79 (1991); Schaub, 
Prostaglandins 35:467-474, 1988. 

4.10.11 CANCER DIAGNOSIS AND THERAPY 

Polypeptides of the invention may be involved in cancer cell generation, proliferation 
or metastasis. Detection of the presence or amount of polynucleotides or polypeptides of the 
invention may be useful for the diagnosis and/or prognosis of one or more types of cancer. 
For example, the presence or increased expression of a polynucleotide/polypeptide of the 
invention may indicate a hereditary risk of cancer, a precancerous condition, or an ongoing 
malignancy. Conversely, a defect in the gene or absence of the polypeptide may be 
associated with a cancer condition. Identification of single nucleotide polymorphisms 
associated with cancer or a predisposition to cancer may also be useful for diagnosis or 
prognosis. 

Cancer treatments promote tumor regression by inhibiting tumor cell proliferation, 
inhibiting angiogenesis (growth of new blood vessels that is necessary to support tumor 
growth) and/or prohibiting metastasis by reducing tumor cell motility or invasiveness. 
Therapeutic compositions of the invention may be effective in adult and pediatric oncology 
including in solid phase tumors/malignancies, locally advanced tumors, human soft tissue 
sarcomas, metastatic cancer, including lymphatic metastases, blood cell malignancies 
including multiple myeloma, acute and chronic leukemias, and lymphomas, head and neck 
cancers including mouth cancer, larynx cancer and thyroid cancer, lung cancers including 
small cell carcinoma and non-small cell cancers, breast cancers including small cell 
carcinoma and ductal carcinoma, gastrointestinal cancers including esophageal cancer, 
stomach cancer, colon cancer, colorectal cancer and polyps associated with colorectal 
neoplasia, pancreatic cancers, liver cancer, urologic cancers including bladder cancer and 
prostate cancer, malignancies of the female genital tract including ovarian carcinoma, uterine 
(including endometrial) cancers, and solid tumor in the ovarian follicle, kidney cancers 
including renal cell carcinoma, brain cancers including intrinsic brain tumors, 
neuroblastoma, astrocytic brain tumors, gliomas, metastatic tumor cell invasion in the central 
nervous system, bone cancers including osteomas, skin cancers including malignant 
melanoma, tumor progression of human skin keratinocytes, squamous cell carcinoma, basal 
cell carcinoma, hemangiopericytoma and Kaposi's sarcoma. 

Polypeptides, polynucleotides, or modulators of polypeptides of the invention 
(including inhibitors and stimulators of the biological activity of the polypeptide of the 
invention) may be administered to treat cancer. Therapeutic compositions can be 
administered in therapeutically effective dosages alone or in combination with adjuvant 
cancer therapy such as surgery, chemotherapy, radiotherapy, thermotherapy, and laser 
therapy, and may provide a beneficial effect, e.g. reducing tumor size, slowing rate of tumor 
growth, inhibiting metastasis, or otherwise improving overall clinical condition, without 
necessarily eradicating the cancer. 

The composition can also be administered in therapeutically effective amounts as a 
portion of an anti-cancer cocktail. An anti-cancer cocktail is a mixture of the polypeptide or 
modulator of the invention with one or more anti-cancer drugs in addition to a 
pharmaceutically acceptable carrier for delivery. The use of anti-cancer cocktails as a cancer 
treatment is routine. Anti-cancer drugs that are well known in the art and can be used as a 
treatment in combination with the polypeptide or modulator of the invention include: 
Actinomycin D, Aminoglutethimide, Asparaginase, Bleomycin, Busulfan, Carboplatin, 
Carmustine, Chlorambucil, Cisplatin (cis-DDP), Cyclophosphamide, Cytarabine HCl 
(Cytosine arabinoside), Dacarbazine, Dactinomycin, Daunorubicin HCl, Doxorubicin HCl, 
Estramustine phosphate sodium, Etoposide (VP16-213), Floxuridine, 5-Fluorouracil (5-Fu), 
Flutamide, Hydroxyurea (hydroxycarbamide), Ifosfamide, Interferon Alpha-2a, Interferon 
Alpha-2b, Leuprolide acetate (LHRH-releasing factor analog), Lomustine, Mechlorethamine 
HCl (nitrogen mustard), Melphalan, Mercaptopurine, Mesna, Methotrexate (MTX), 
Mitomycin, Mitoxantrone HCl, Octreotide, Plicamycin, Procarbazine HCl, Streptozocin, 
Tamoxifen citrate, Thioguanine, Thiotepa, Vinblastine sulfate, Vincristine sulfate, 
Amsacrine, Azacitidine, Hexamethylmelamine, Interleukin-2, Mitoguazone, Pentostatin, 
Semustine, Teniposide, and Vindesine sulfate. 

In addition, therapeutic compositions of the invention may be used for prophylactic 
treatment of cancer. There are hereditary conditions and/or environmental situations (e.g. 
exposure to carcinogens) known in the art that predispose an individual to developing 
cancers. Under these circumstances, it may be beneficial to treat these individuals with 
therapeutically effective doses of the polypeptide of the invention to reduce the risk of 
developing cancers. 

In vitro models can be used to determine the effective doses of the polypeptide of the 
invention as a potential cancer treatment. These in vitro models include proliferation assays 
of cultured tumor cells, growth of cultured tumor cells in soft agar (see Freshney, (1987) 
Culture of Animal Cells: A Manual of Basic Technique, Wiley-Liss, New York, NY Ch 18 
and Ch 21), tumor systems in nude mice as described in Giovanella et al., J. Natl. Can. Inst., 
52:921-30 (1974), mobility and invasive potential of tumor cells in Boyden Chamber assays 
as described in Pilkington et al., Anticancer Res., 17:4107-9 (1997), and angiogenesis 
assays such as induction of vascularization of the chick chorioallantoic membrane or 
induction of vascular endothelial cell migration as described in Ribatti et al., Intl. J. Dev. 
Biol., 40:1189-97 (1999) and Li et al., Clin. Exp. Metastasis, 17:423-9 (1999), respectively. 
Suitable tumor cell lines are available, e.g. from American Type Culture Collection 
catalogs. 

4.10.12 RECEPTOR/LIGAND ACTIVITY 

A polypeptide of the present invention may also demonstrate activity as a receptor, 
receptor ligand or inhibitor or agonist of receptor/ligand interactions. A polynucleotide of 
the invention can encode a polypeptide exhibiting such characteristics. Examples of such 
receptors and ligands include, without limitation, cytokine receptors and their ligands, 
receptor kinases and their ligands, receptor phosphatases and their ligands, receptors 
involved in cell-cell interactions and their ligands (including without limitation, cellular 
adhesion molecules (such as selectins, integrins and their ligands) and receptor/ligand pairs 
involved in antigen presentation, antigen recognition and development of cellular and 
humoral immune responses). Receptors and ligands are also useful for screening of potential 
peptide or small molecule inhibitors of the relevant receptor/ligand interaction. Proteins of 
the present invention (including, without limitation, fragments of receptors and ligands) may 
themselves be useful as inhibitors of receptor/ligand interactions. 

The activity of a polypeptide of the invention may, among other means, be measured 
by the following methods: 

Suitable assays for receptor-ligand activity include without limitation those described 
in: Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D. H. 
Margulies, E. M. Shevach, W. Strober, Pub. Greene Publishing Associates and 
Wiley-Interscience (Chapter 7.28, Measurement of Cellular Adhesion under static conditions 
7.28.1-7.28.22), Takai et al., Proc. Natl. Acad. Sci. USA 84:6864-6868, 1987; Bierer et al., 
J. Exp. Med. 168:1145-1156, 1988; Rosenstein et al., J. Exp. Med. 169:149-160, 1989; 
Stoltenborg et al., J. Immunol. Methods 175:59-68, 1994; Stitt et al., Cell 80:661-670, 1995. 



WO 03/025148 PCT/US02/29964 

By way of example, the polypeptides of the invention may be used as a receptor for a 
ligand(s) thereby transmitting the biological activity of that ligand(s). Ligands may be 
identified through binding assays, affinity chromatography, dihybrid screening assays, 
BIAcore assays, gel overlay assays, or other methods known in the art. 

Studies characterizing drugs or proteins as agonists, antagonists, partial agonists or 
partial antagonists require the use of other proteins as competing ligands. The polypeptides 
of the present invention or ligand(s) thereof may be labeled by being coupled to 
radioisotopes, colorimetric molecules or toxin molecules by conventional methods 
("Guide to Protein Purification" Murray P. Deutscher (ed) Methods in Enzymology Vol. 182 
(1990) Academic Press, Inc. San Diego). Examples of radioisotopes include, but are not 
limited to, tritium and carbon-14. Examples of colorimetric molecules include, but are not 
limited to, fluorescent molecules such as fluorescamine, or rhodamine or other colorimetric 
molecules. Examples of toxins include, but are not limited to, ricin. 

4.10.13 DRUG SCREENING 

This invention is particularly useful for screening chemical compounds by using the 
novel polypeptides or binding fragments thereof in any of a variety of drug screening 
techniques. The polypeptides or fragments employed in such a test may either be free in 
solution, affixed to a solid support, borne on a cell surface or located intracellularly. One 
method of drug screening utilizes eukaryotic or prokaryotic host cells which are stably 
transformed with recombinant nucleic acids expressing the polypeptide or a fragment 
thereof. Drugs are screened against such transformed cells in competitive binding assays. 
Such cells, either in viable or fixed form, can be used for standard binding assays. One may 
measure, for example, the formation of complexes between polypeptides of the invention or 
fragments and the agent being tested or examine the diminution in complex formation 
between the novel polypeptides and an appropriate cell line, which are well known in the art. 
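
By way of illustration only, the diminution in complex formation measured in such a competitive binding assay can be expressed as a percent inhibition relative to a control without test agent. The following Python sketch is not part of the described methods; the function names, the background correction, and the 50% hit threshold are hypothetical assumptions chosen for the example.

```python
def percent_inhibition(signal_with_agent, signal_control, signal_background=0.0):
    """Percent reduction in complex formation caused by a test agent.

    All signals are in the same arbitrary units (e.g., bound label counts).
    The background-subtraction scheme is an assumption of this sketch.
    """
    window = signal_control - signal_background
    if window <= 0:
        raise ValueError("control signal must exceed background")
    bound = signal_with_agent - signal_background
    return 100.0 * (1.0 - bound / window)


def rank_hits(readings, signal_control, signal_background=0.0, threshold=50.0):
    """Return (agent, percent inhibition) pairs at or above a hypothetical threshold,
    strongest inhibitor first."""
    hits = []
    for name, signal in readings.items():
        pct = percent_inhibition(signal, signal_control, signal_background)
        if pct >= threshold:
            hits.append((name, pct))
    return sorted(hits, key=lambda item: item[1], reverse=True)
```

In use, `readings` would map each test agent to its measured signal; agents that leave complex formation essentially unchanged fall below the threshold and are not reported.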

Sources for test compounds that may be screened for ability to bind to or modulate 
(i.e., increase or decrease) the activity of polypeptides of the invention include (1) inorganic 
and organic chemical libraries, (2) natural product libraries, and (3) combinatorial libraries 
comprised of either random or mimetic peptides, oligonucleotides or organic molecules. 

Chemical libraries may be readily synthesized or purchased from a number of 
commercial sources, and may include structural analogs of known compounds or compounds 
that are identified as "hits" or "leads" via natural product screening. 

The sources of natural product libraries are microorganisms (including bacteria and 
fungi), animals, plants or other vegetation, or marine organisms, and libraries of mixtures for 
screening may be created by: (1) fermentation and extraction of broths from soil, plant or 
marine microorganisms or (2) extraction of the organisms themselves. Natural product 
libraries include polyketides, non-ribosomal peptides, and (non-naturally occurring) variants 
thereof. For a review, see Science 252:63-68 (1998). 

Combinatorial libraries are composed of large numbers of peptides, oligonucleotides 
or organic compounds and can be readily prepared by traditional automated synthesis 
methods, PCR, cloning or proprietary synthetic methods. Of particular interest are peptide 
and oligonucleotide combinatorial libraries. Still other libraries of interest include peptide, 
protein, peptidomimetic, multiparallel synthetic collection, recombinatorial, and polypeptide 
libraries. For a review of combinatorial chemistry and libraries created therefrom, see 
Myers, Curr. Opin. Biotechnol. 8:701-707 (1997). For reviews and examples of 
peptidomimetic libraries, see Al-Obeidi et al., Mol. Biotechnol., 9(3):205-23 (1998); Hruby 
et al., Curr. Opin. Chem. Biol., 1(1):114-19 (1997); Dorner et al., Bioorg. Med. Chem., 
4(5):709-15 (1996) (alkylated dipeptides). 

Identification of modulators through use of the various libraries described herein 
permits modification of the candidate "hit" (or "lead") to optimize the capacity of the "hit" 
to bind a polypeptide of the invention. The molecules identified in the binding assay are then 
tested for antagonist or agonist activity in in vivo tissue culture or animal models that are 
well known in the art. In brief, the molecules are titrated into a plurality of cell cultures or 
animals and then tested for either cell/animal death or prolonged survival of the animal/cells. 
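
As a purely illustrative sketch of how results from such a titration might be tabulated, the Python code below computes the fraction of cultures (or animals) surviving at each dose and reports the lowest dose at which survival departs from the untreated group. The data layout, the function names, and the 0.25 change criterion are hypothetical assumptions and are not taken from the methods described herein.

```python
def survival_by_dose(observations):
    """Map each dose to the fraction of cultures (or animals) that survived.

    `observations` is an iterable of (dose, survived) pairs, where `survived`
    is True or False for one culture or animal.
    """
    totals = {}
    for dose, survived in observations:
        alive, count = totals.get(dose, (0, 0))
        totals[dose] = (alive + (1 if survived else 0), count + 1)
    return {dose: alive / count for dose, (alive, count) in sorted(totals.items())}


def lowest_effective_dose(observations, untreated_dose=0.0, min_change=0.25):
    """Lowest dose whose survival fraction differs from the untreated group
    by at least `min_change` (a hypothetical criterion), or None."""
    fractions = survival_by_dose(observations)
    baseline = fractions[untreated_dose]
    for dose, fraction in fractions.items():
        if dose != untreated_dose and abs(fraction - baseline) >= min_change:
            return dose
    return None
```

Because the comparison uses the absolute change in survival, the same routine flags either increased death or prolonged survival relative to the untreated group.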

The binding molecules thus identified may be complexed with toxins, e.g., ricin or 
cholera, or with other compounds that are toxic to cells such as radioisotopes. The 
toxin-binding molecule complex is then targeted to a tumor or other cell by the specificity of 
the binding molecule for a polypeptide of the invention. Alternatively, the binding 
molecules may be complexed with imaging agents for targeting and imaging purposes. 

4.10.14 ASSAY FOR RECEPTOR ACTIVITY 

The invention also provides methods to detect specific binding of a polypeptide, e.g., a 
ligand or a receptor. The art provides numerous assays particularly useful for identifying 
previously unknown binding partners for receptor polypeptides of the invention. For 
example, expression cloning using mammalian or bacterial cells, or dihybrid screening 
assays can be used to identify polynucleotides encoding binding partners. As another 
example, affinity chromatography with the appropriate immobilized polypeptide of the 
invention can be used to isolate polypeptides that recognize and bind polypeptides of the 
invention. There are a number of different libraries used for the identification of 
compounds, and in particular small molecules, that modulate (i.e., increase or decrease) 
biological activity of a polypeptide of the invention. Ligands for receptor polypeptides of the 
invention can also be identified by adding exogenous ligands, or cocktails of ligands, to two 
cell populations that are genetically identical except for the expression of the receptor of the 
invention: one cell population expresses the receptor of the invention whereas the other does 
not. The responses of the two cell populations to the addition of ligand(s) are then 
compared. Alternatively, an expression library can be co-expressed with the polypeptide of 
the invention in cells and assayed for an autocrine response to identify potential ligand(s). As 
still another example, BIAcore assays, gel overlay assays, or other methods known in the art 
can be used to identify binding partner polypeptides, including, (1) organic and inorganic 
chemical libraries, (2) natural product libraries, and (3) combinatorial libraries comprised of 
random peptides, oligonucleotides or organic molecules. 
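
The comparison of two otherwise identical cell populations described above can be sketched, for illustration only, as a fold-change calculation. In the Python example below, the function name, the dictionary layout of the response measurements, and the two-fold cutoff are hypothetical assumptions rather than features of the described assays.

```python
def candidate_ligands(expressing, control, min_fold=2.0):
    """Return ligands whose response in receptor-expressing cells exceeds the
    response in non-expressing control cells by at least `min_fold`.

    `expressing` and `control` map ligand names to a measured response in
    arbitrary units; the two-fold default cutoff is a hypothetical choice.
    """
    candidates = {}
    for ligand, response in expressing.items():
        baseline = control.get(ligand)
        if baseline is None or baseline <= 0:
            continue  # no usable control measurement for this ligand
        fold = response / baseline
        if fold >= min_fold:
            candidates[ligand] = fold
    return candidates
```

Ligands lacking a control measurement are skipped rather than scored, since without the non-expressing population the response cannot be attributed to the receptor.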

The role of downstream intracellular signaling molecules in the signaling cascade of 
the polypeptide of the invention can be determined. For example, a chimeric protein in 
which the cytoplasmic domain of the polypeptide of the invention is fused to the 
extracellular portion of a protein, whose ligand has been identified, is produced in a host 
cell. The cell is then incubated with the ligand specific for the extracellular portion of the 
chimeric protein, thereby activating the chimeric receptor. Known downstream proteins 
involved in intracellular signaling can then be assayed for expected modifications, i.e., 
phosphorylation. Other methods known to those in the art can also be used to identify 
signaling molecules involved in receptor activity. 

4.10.15 ANTI-INFLAMMATORY ACTIVITY 

Compositions of the present invention may also exhibit anti-inflammatory activity. 
The anti-inflammatory activity may be achieved by providing a stimulus to cells involved in 
the inflammatory response, by inhibiting or promoting cell-cell interactions (such as, for 
example, cell adhesion), by inhibiting or promoting chemotaxis of cells involved in the 
inflammatory process, inhibiting or promoting cell extravasation, or by stimulating or 
suppressing production of other factors which more directly inhibit or promote an 
inflammatory response. Compositions with such activities can be used to treat inflammatory 
conditions (including chronic or acute conditions), including without limitation inflammation 
associated with infection (such as septic shock, sepsis or systemic inflammatory response 
syndrome (SIRS)), ischemia-reperfusion injury, endotoxin lethality, arthritis, 
complement-mediated hyperacute rejection, nephritis, cytokine or chemokine-induced lung 
injury, inflammatory bowel disease, Crohn's disease or resulting from over production of 
cytokines such as TNF or IL-1. Compositions of the invention may also be useful to treat 
anaphylaxis and hypersensitivity to an antigenic substance or material. Compositions of this 
invention may be utilized to prevent or treat conditions such as, but not limited to, sepsis, 
acute pancreatitis, endotoxin shock, cytokine induced shock, rheumatoid arthritis, chronic 
inflammatory arthritis, pancreatic cell damage from diabetes mellitus type 1, graft versus 
host disease, inflammatory bowel disease, inflammation associated with pulmonary disease, 
other autoimmune disease or inflammatory disease, an antiproliferative agent such as for 
acute or chronic myelogenous leukemia or in the prevention of premature labor secondary to 
intrauterine infections. 

4.10.16 LEUKEMIAS 

Leukemias and related disorders may be treated or prevented by administration of a 
therapeutic that promotes or inhibits function of the polynucleotides and/or polypeptides of 
the invention. Such leukemias and related disorders include but are not limited to acute 
leukemia, acute lymphocytic leukemia, acute myelocytic leukemia, myeloblastic, 
promyelocytic, myelomonocytic, monocytic, erythroleukemia, chronic leukemia, chronic 
myelocytic (granulocytic) leukemia and chronic lymphocytic leukemia (for a review of such 
disorders, see Fishman et al., 1985, Medicine, 2d Ed., J.B. Lippincott Co., Philadelphia). 

4.10.17 NERVOUS SYSTEM DISORDERS 

Nervous system disorders, involving cell types which can be tested for efficacy of 
intervention with compounds that modulate the activity of the polynucleotides and/or 
polypeptides of the invention, and which can be treated upon thus observing an indication of 
therapeutic utility, include but are not limited to nervous system injuries, and diseases or 
disorders which result in either a disconnection of axons, a diminution or degeneration of 
neurons, or demyelination. Nervous system lesions which may be treated in a patient 
(including human and non-human mammalian patients) according to the invention include 
but are not limited to the following lesions of either the central (including spinal cord, brain) 
or peripheral nervous systems: 

(i) traumatic lesions, including lesions caused by physical injury or associated 
with surgery, for example, lesions which sever a portion of the nervous system, or 
compression injuries; 

(ii) ischemic lesions, in which a lack of oxygen in a portion of the nervous system 
results in neuronal injury or death, including cerebral infarction or ischemia, or spinal cord 
infarction or ischemia; 

(iii) infectious lesions, in which a portion of the nervous system is destroyed or 
injured as a result of infection, for example, by an abscess or associated with infection by 
human immunodeficiency virus, herpes zoster, or herpes simplex virus or with Lyme 
disease, tuberculosis, syphilis; 

(iv) degenerative lesions, in which a portion of the nervous system is destroyed or 
injured as a result of a degenerative process including but not limited to degeneration 
associated with Parkinson's disease, Alzheimer's disease, Huntington's chorea, or 
amyotrophic lateral sclerosis; 

(v) lesions associated with nutritional diseases or disorders, in which a portion of 
the nervous system is destroyed or injured by a nutritional disorder or disorder of 
metabolism including but not limited to, vitamin B12 deficiency, folic acid deficiency, 
Wernicke disease, tobacco-alcohol amblyopia, Marchiafava-Bignami disease (primary 
degeneration of the corpus callosum), and alcoholic cerebellar degeneration; 

(vi) neurological lesions associated with systemic diseases including but not 
limited to diabetes (diabetic neuropathy, Bell's palsy), systemic lupus erythematosus, 
carcinoma, or sarcoidosis; 

(vii) lesions caused by toxic substances including alcohol, lead, or particular 
neurotoxins; and 

(viii) demyelinated lesions in which a portion of the nervous system is destroyed or 
injured by a demyelinating disease including but not limited to multiple sclerosis, human 
immunodeficiency virus-associated myelopathy, transverse myelopathy of various 
etiologies, progressive multifocal leukoencephalopathy, and central pontine myelinolysis. 

Therapeutics which are useful according to the invention for treatment of a nervous 
system disorder may be selected by testing for biological activity in promoting the survival 
or differentiation of neurons. For example, and not by way of limitation, therapeutics which 
elicit any of the following effects may be useful according to the invention: 

(i) increased survival time of neurons in culture; 

(ii) increased sprouting of neurons in culture or in vivo; 

(iii) increased production of a neuron-associated molecule in culture or in vivo, 
e.g., choline acetyltransferase or acetylcholinesterase with respect to motor neurons; or 

(iv) decreased symptoms of neuron dysfunction in vivo. 

Such effects may be measured by any method known in the art. In preferred, 
non-limiting embodiments, increased survival of neurons may be measured by the method 
set forth in Arakawa et al. (1990, J. Neurosci. 10:3507-3515); increased sprouting of neurons 
may be detected by methods set forth in Pestronk et al. (1980, Exp. Neurol. 70:65-82) or 
Brown et al. (1981, Ann. Rev. Neurosci. 4:17-42); increased production of 
neuron-associated molecules may be measured by bioassay, enzymatic assay, antibody 
binding, Northern blot assay, etc., depending on the molecule to be measured; and motor 
neuron dysfunction may be measured by assessing the physical manifestation of motor 
neuron disorder, e.g., weakness, motor neuron conduction velocity, or functional disability. 

In specific embodiments, motor neuron disorders that may be treated according to the 
invention include but are not limited to disorders such as infarction, infection, exposure to 
toxin, trauma, surgical damage, degenerative disease or malignancy that may affect motor 
neurons as well as other components of the nervous system, as well as disorders that 
selectively affect neurons such as amyotrophic lateral sclerosis, and including but not limited 
to progressive spinal muscular atrophy, progressive bulbar palsy, primary lateral sclerosis, 
infantile and juvenile muscular atrophy, progressive bulbar paralysis of childhood 
(Fazio-Londe syndrome), poliomyelitis and the post polio syndrome, and Hereditary 
Motorsensory Neuropathy (Charcot-Marie-Tooth Disease). 

4.10.18 OTHER ACTIVITIES 

A polypeptide of the invention may also exhibit one or more of the following 
additional activities or effects: inhibiting the growth, infection or function of, or killing, 
infectious agents, including, without limitation, bacteria, viruses, fungi and other parasites; 
affecting (suppressing or enhancing) bodily characteristics, including, without limitation, 
height, weight, hair color, eye color, skin, fat to lean ratio or other tissue pigmentation, or 
organ or body part size or shape (such as, for example, breast augmentation or diminution, 
change in bone form or shape); affecting biorhythms or circadian cycles or rhythms; 
affecting the fertility of male or female subjects; affecting the metabolism, catabolism, 
anabolism, processing, utilization, storage or elimination of dietary fat, lipid, protein, 
carbohydrate, vitamins, minerals, co-factors or other nutritional factors or component(s); 
affecting behavioral characteristics, including, without limitation, appetite, libido, stress, 
cognition (including cognitive disorders), depression (including depressive disorders) and 
violent behaviors; providing analgesic effects or other pain reducing effects; promoting 
differentiation and growth of embryonic stem cells in lineages other than hematopoietic 
lineages; hormonal or endocrine activity; in the case of enzymes, correcting deficiencies of 
the enzyme and treating deficiency-related diseases; treatment of hyperproliferative 
disorders (such as, for example, psoriasis); immunoglobulin-like activity (such as, for 
example, the ability to bind antigens or complement); and the ability to act as an antigen in a 
vaccine composition to raise an immune response against such protein or another material or 
entity which is cross-reactive with such protein. 

4.10.19 IDENTIFICATION OF POLYMORPHISMS 

The demonstration of polymorphisms makes possible the identification of such 
polymorphisms in human subjects and the pharmacogenetic use of this information for 
diagnosis and treatment. Such polymorphisms may be associated with, e.g., differential 
predisposition or susceptibility to various disease states (such as disorders involving 
inflammation or immune response) or a differential response to drug administration, and this 
genetic information can be used to tailor preventive or therapeutic treatment appropriately. 
For example, the existence of a polymorphism associated with a predisposition to 
inflammation or autoimmune disease makes possible the diagnosis of this condition in 
humans by identifying the presence of the polymorphism. 

Polymorphisms can be identified in a variety of ways known in the art which all 
generally involve obtaining a sample from a patient, analyzing DNA from the sample, 
optionally involving isolation or amplification of the DNA, and identifying the presence of 
the polymorphism in the DNA. For example, PCR may be used to amplify an appropriate 
fragment of genomic DNA which may then be sequenced. Alternatively, the DNA may be 
subjected to allele-specific oligonucleotide hybridization (in which appropriate 
oligonucleotides are hybridized to the DNA under conditions permitting detection of a single 
base mismatch) or to a single nucleotide extension assay (in which an oligonucleotide that 
hybridizes immediately adjacent to the position of the polymorphism is extended with one or 
more labeled nucleotides). In addition, traditional restriction fragment length polymorphism 
analysis (using restriction enzymes that provide differential digestion of the genomic DNA 
depending on the presence or absence of the polymorphism) may be performed. Arrays with 
nucleotide sequences of the present invention can be used to detect polymorphisms. The 
array can comprise modified nucleotide sequences of the present invention in order to detect 
the nucleotide sequences of the present invention. In the alternative, any one of the 
nucleotide sequences of the present invention can be placed on the array to detect changes 
from those sequences. 

Alternatively, a polymorphism resulting in a change in the amino acid sequence could 
also be detected by detecting a corresponding change in amino acid sequence of the protein, 
e.g., by an antibody specific to the variant sequence. 
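
The restriction fragment length polymorphism analysis mentioned above can be illustrated with a minimal Python sketch. It assumes the enzyme EcoRI (recognition site GAATTC, cut between G and A) and two short hypothetical alleles that differ by a single base inside that site; the sequences, function names and fragment sizes are illustrative only and are not taken from the sequences of the invention.

```python
def digest(sequence, site, cut_offset):
    """Return fragment lengths after cutting `sequence` at each occurrence of `site`.

    `cut_offset` is the position inside the recognition site at which the
    enzyme cuts (1 for EcoRI, which cuts G^AATTC).
    """
    fragments = []
    start = 0
    pos = sequence.find(site)
    while pos != -1:
        cut = pos + cut_offset
        fragments.append(cut - start)
        start = cut
        pos = sequence.find(site, pos + 1)
    fragments.append(len(sequence) - start)
    return fragments


# Hypothetical alleles: allele B carries an A -> G change that destroys the site.
ALLELE_A = "ATGCCGTAGAATTCGGCTTAACGT"
ALLELE_B = "ATGCCGTAGAGTTCGGCTTAACGT"


def rflp_pattern(sequence):
    """Fragment-length pattern for an EcoRI digest of `sequence`."""
    return digest(sequence, "GAATTC", 1)


if __name__ == "__main__":
    print("allele A fragments:", rflp_pattern(ALLELE_A))
    print("allele B fragments:", rflp_pattern(ALLELE_B))
```

Under these assumptions the allele that retains the site yields two fragments and the allele that has lost it yields one, which is the differential digestion on which the analysis relies.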

4.10.20 ARTHRITIS AND INFLAMMATION 

The immunosuppressive effects of the compositions of the invention against 
rheumatoid arthritis are determined in an experimental animal model system. The 
experimental model system is adjuvant induced arthritis in rats, and the protocol is described 
by J. Holoshitz, et al., 1983, Science, 219:56, or by B. Waksman et al., 1963, Int. Arch. 
Allergy Appl. Immunol., 23:129. Induction of the disease can be caused by a single 
injection, generally intradermally, of a suspension of killed Mycobacterium tuberculosis in 
complete Freund's adjuvant (CFA). The route of injection can vary, but rats may be injected 
at the base of the tail with an adjuvant mixture. The polypeptide is administered in phosphate 
buffered solution (PBS) at a dose of about 1-5 mg/kg. The control consists of administering 
PBS only. 

The procedure for testing the effects of the test compound would consist of 
intradermally injecting killed Mycobacterium tuberculosis in CFA followed by immediately 
administering the test compound and subsequent treatment every other day until day 24. At 
14, 15, 18, 20, 22, and 24 days after injection of Mycobacterium in CFA, an overall arthritis 
score may be obtained as described by J. Holoshitz above. An analysis of the data would 
reveal that the test compound would have a dramatic effect on the swelling of the joints as 
measured by a decrease of the arthritis score. 
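
The treatment and scoring schedule described above can be laid out in a short Python sketch. It assumes that day 0 is the day of adjuvant injection and of the first dose, and that the arthritis score of each animal is recorded as a single number per scoring day; the function names are hypothetical and no experimental data are implied.

```python
from statistics import mean

# Scoring days stated in the protocol (days after adjuvant injection).
SCORING_DAYS = (14, 15, 18, 20, 22, 24)


def dosing_days(last_day=24, interval=2):
    """Days on which the test compound is given: day 0, then every other day."""
    return list(range(0, last_day + 1, interval))


def mean_scores(group):
    """Mean arthritis score per scoring day.

    `group` is a list of per-animal dicts mapping scoring day -> score.
    """
    return {day: mean(animal[day] for animal in group) for day in SCORING_DAYS}


def score_reduction(control_group, treated_group):
    """Control mean minus treated mean for each scoring day."""
    control = mean_scores(control_group)
    treated = mean_scores(treated_group)
    return {day: control[day] - treated[day] for day in SCORING_DAYS}
```

A positive value from `score_reduction` on a given day corresponds to a lower mean arthritis score in the treated group than in the PBS control group on that day.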



4.11 THERAPEUTIC METHODS 

The compositions (including polypeptide fragments, analogs, variants and antibodies 
or other binding partners or modulators including antisense polynucleotides) of the invention 
have numerous applications in a variety of therapeutic methods. Examples of therapeutic 
applications include, but are not limited to, those exemplified herein. 

4.11.1 EXAMPLE 

One embodiment of the invention is the administration of an effective amount of the 
polypeptides or other composition of the invention to individuals affected by a disease or 
disorder that can be modulated by regulating the peptides of the invention. While the mode 
of administration is not particularly important, parenteral administration is preferred. An 
exemplary mode of administration is to deliver an intravenous bolus. The dosage of the 
polypeptides or other composition of the invention will normally be determined by the 
prescribing physician. It is to be expected that the dosage will vary according to the age, 
weight, condition and response of the individual patient. Typically, the amount of 
polypeptide administered per dose will be in the range of about 0.01 µg/kg to 100 mg/kg of 
body weight, with the preferred dose being about 0.1 µg/kg to 10 mg/kg of patient body 
weight. For parenteral administration, polypeptides of the invention will be formulated in an 
injectable form combined with a pharmaceutically acceptable parenteral vehicle. Such 
vehicles are well known in the art and examples include water, saline, Ringer's solution, 
dextrose solution, and solutions consisting of small amounts of human serum albumin. 
The vehicle may contain minor amounts of additives that maintain the isotonicity and 
stability of the polypeptide or other active ingredient. The preparation of such solutions is 
within the skill of the art. 
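
As a rough illustration of the arithmetic implied by the weight-based ranges above, the following Python sketch scales a per-kilogram range to an individual body weight. It assumes the lower bounds of both ranges are expressed in micrograms per kilogram; the constant and function names are hypothetical. The result is an arithmetic window only, since the dosage itself is determined by the prescribing physician.

```python
UG_PER_MG = 1000.0

# (low, high) bounds in micrograms per kilogram of body weight.
# Assumption: lower bounds are micrograms/kg; upper bounds are converted from mg/kg.
BROAD_RANGE_UG_PER_KG = (0.01, 100 * UG_PER_MG)
PREFERRED_RANGE_UG_PER_KG = (0.1, 10 * UG_PER_MG)


def dose_window_ug(body_weight_kg, range_ug_per_kg):
    """Scale a per-kg dose range to a total per-dose window in micrograms."""
    if body_weight_kg <= 0:
        raise ValueError("body weight must be positive")
    low, high = range_ug_per_kg
    return (low * body_weight_kg, high * body_weight_kg)


if __name__ == "__main__":
    for label, rng in (("broad", BROAD_RANGE_UG_PER_KG),
                       ("preferred", PREFERRED_RANGE_UG_PER_KG)):
        low, high = dose_window_ug(70.0, rng)
        print(f"{label}: {low:.2f} ug to {high / UG_PER_MG:.0f} mg per dose")
```

For a 70 kg patient, for example, the preferred range corresponds to roughly 7 µg to 700 mg per dose under these assumptions.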

4.12 PHARMACEUTICAL FORMULATIONS AND ROUTES OF ADMINISTRATION 

A protein or other composition of the present invention (from whatever source 
derived, including without limitation from recombinant and non-recombinant sources and 
including antibodies and other binding partners of the polypeptides of the invention) may be 
administered to a patient in need, by itself, or in pharmaceutical compositions where it is 
mixed with suitable carriers or excipient(s) at doses to treat or ameliorate a variety of 
disorders. Such a composition may optionally contain (in addition to protein or other active 
ingredient and a carrier) diluents, fillers, salts, buffers, stabilizers, solubilizers, and other 
materials well known in the art. The term "pharmaceutically acceptable" means a non-toxic 
material that does not interfere with the effectiveness of the biological activity of the active 
ingredient(s). The characteristics of the carrier will depend on the route of administration. 
The pharmaceutical composition of the invention may also contain cytokines, lymphokines, 
or other hematopoietic factors such as M-CSF, GM-CSF, TNF, IL-1, IL-2, IL-3, IL-4, IL-5, 
IL-6, IL-7, IL-8, IL-9, IL-10, IL-11, IL-12, IL-13, IL-14, IL-15, IFN, TNF0, TNF1, TNF2, 
G-CSF, Meg-CSF, thrombopoietin, stem cell factor, and erythropoietin. In further 
compositions, proteins of the invention may be combined with other agents beneficial to the 
treatment of the disease or disorder in question. These agents include various growth factors 
such as epidermal growth factor (EGF), platelet-derived growth factor (PDGF), transforming 
growth factors (TGF-α and TGF-β), insulin-like growth factor (IGF), as well as cytokines 
described herein. 

The pharmaceutical composition may further contain other agents which either 
enhance the activity of the protein or other active ingredient or complement its activity or 
use in treatment. Such additional factors and/or agents may be included in the 
pharmaceutical composition to produce a synergistic effect with protein or other active 
ingredient of the invention, or to minimize side effects. Conversely, protein or other active 
ingredient of the present invention may be included in formulations of the particular clotting 
factor, cytokine, lymphokine, other hematopoietic factor, thrombolytic or antithrombotic 
factor, or anti-inflammatory agent to minimize side effects of the clotting factor, cytokine, 
lymphokine, other hematopoietic factor, thrombolytic or antithrombotic factor, or 
anti-inflammatory agent (such as IL-1Ra, IL-1 Hy1, IL-1 Hy2, anti-TNF, corticosteroids, 
immunosuppressive agents). A protein of the present invention may be active in multimers 
(e.g., heterodimers or homodimers) or complexes with itself or other proteins. As a result, 
pharmaceutical compositions of the invention may comprise a protein of the invention in 
such multimeric or complexed form. 

As an alternative to being included in a pharmaceutical composition of the invention 
including a first protein, a second protein or a therapeutic agent may be concurrently 
administered with the first protein (e.g., at the same time, or at differing times provided that 
therapeutic concentrations of the combination of agents are achieved at the treatment site). 
Techniques for formulation and administration of the compounds of the instant application 
may be found in "Remington's Pharmaceutical Sciences," Mack Publishing Co., Easton, PA, 
latest edition. A therapeutically effective dose further refers to that amount of the compound 
sufficient to result in amelioration of symptoms, e.g., treatment, healing, prevention or 
amelioration of the relevant medical condition, or an increase in rate of treatment, healing, 
prevention or amelioration of such conditions. When applied to an individual active 
ingredient, administered alone, a therapeutically effective dose refers to that ingredient 
alone. When applied to a combination, a therapeutically effective dose refers to combined 
amounts of the active ingredients that result in the therapeutic effect, whether administered 
in combination, serially or simultaneously. 

In practicing the method of treatment or use of the present invention, a 
therapeutically effective amount of protein or other active ingredient of the present invention 
is administered to a mammal having a condition to be treated. Protein or other active 
ingredient of the present invention may be administered in accordance with the method of 
the invention either alone or in combination with other therapies such as treatments 
employing cytokines, lymphokines or other hematopoietic factors. When co-administered 
with one or more cytokines, lymphokines or other hematopoietic factors, protein or other 
active ingredient of the present invention may be administered either simultaneously with 
the cytokine(s), lymphokine(s), other hematopoietic factor(s), thrombolytic or 
antithrombotic factors, or sequentially. If administered sequentially, the attending physician 
will decide on the appropriate sequence of administering protein or other active ingredient of 
the present invention in combination with cytokine(s), lymphokine(s), other hematopoietic 
factor(s), thrombolytic or anti-thrombotic factors. 

4.12.1 ROUTES OF ADMINISTRATION 

Suitable routes of administration may, for example, include oral, rectal, 
transmucosal, or intestinal administration; parenteral delivery, including intramuscular, 
subcutaneous, intramedullary injections, as well as intrathecal, direct intraventricular, 
intravenous, intraperitoneal, intranasal, or intraocular injections. Administration of protein 
or other active ingredient of the present invention used in the pharmaceutical composition or 
to practice the method of the present invention can be carried out in a variety of conventional 
ways, such as oral ingestion, inhalation, topical application or cutaneous, subcutaneous, 
intraperitoneal, parenteral or intravenous injection. Intravenous administration to the patient 
is preferred. 

Alternately, one may administer the compound in a local rather than systemic 
manner, for example, via injection of the compound directly into an arthritic joint or in 
fibrotic tissue, often in a depot or sustained release formulation. In order to prevent the 
scarring process frequently occurring as a complication of glaucoma surgery, the compounds 
may be administered topically, for example, as eye drops. Furthermore, one may administer 
the drug in a targeted drug delivery system, for example, in a liposome coated with a specific 
antibody, targeting, for example, arthritic or fibrotic tissue. The liposomes will be targeted 
to and taken up selectively by the afflicted tissue. 

The polypeptides of the invention are administered by any route that delivers an 
effective dosage to the desired site of action. The determination of a suitable route of 
administration and an effective dosage for a particular indication is within the level of skill 
in the art. Preferably for wound treatment, one administers the therapeutic compound 
directly to the site. Suitable dosage ranges for the polypeptides of the invention can be 
extrapolated from these dosages or from similar studies in appropriate animal models. 
Dosages can then be adjusted as necessary by the clinician to provide maximal therapeutic 
benefit. 

4.12.2 COMPOSITIONS/FORMULATIONS 

Pharmaceutical compositions for use in accordance with the present invention thus 
may be formulated in a conventional manner using one or more physiologically acceptable 
carriers comprising excipients and auxiliaries which facilitate processing of the active 
compounds into preparations which can be used pharmaceutically. These pharmaceutical 
compositions may be manufactured in a manner that is itself known, e.g., by means of 
conventional mixing, dissolving, granulating, dragee-making, levigating, emulsifying, 
encapsulating, entrapping or lyophilizing processes. Proper formulation is dependent upon 
the route of administration chosen. When a therapeutically effective amount of protein or 
other active ingredient of the present invention is administered orally, protein or other active 
ingredient of the present invention will be in the form of a tablet, capsule, powder, solution 
or elixir. When administered in tablet form, the pharmaceutical composition of the invention 
may additionally contain a solid carrier such as a gelatin or an adjuvant. The tablet, capsule, 
and powder contain from about 5 to 95% protein or other active ingredient of the present 
invention, and preferably from about 25 to 90% protein or other active ingredient of the 
present invention. When administered in liquid form, a liquid carrier such as water, 
petroleum, oils of animal or plant origin such as peanut oil, mineral oil, soybean oil, or 
sesame oil, or synthetic oils may be added. The liquid form of the pharmaceutical 
composition may further contain physiological saline solution, dextrose or other saccharide 
solution, or glycols such as ethylene glycol, propylene glycol or polyethylene glycol. When 
administered in liquid form, the pharmaceutical composition contains from about 0.5 to 90% 
by weight of protein or other active ingredient of the present invention, and preferably from 
about 1 to 50% protein or other active ingredient of the present invention. 

When a therapeutically effective amount of protein or other active ingredient of the 
present invention is administered by intravenous, cutaneous or subcutaneous injection, 
protein or other active ingredient of the present invention will be in the form of a 
pyrogen-free, parenterally acceptable aqueous solution. The preparation of such parenterally 
acceptable protein or other active ingredient solutions, having due regard to pH, isotonicity, 
stability, and the like, is within the skill in the art. A preferred pharmaceutical composition 
for intravenous, cutaneous, or subcutaneous injection should contain, in addition to protein 
or other active ingredient of the present invention, an isotonic vehicle such as Sodium 
Chloride Injection, Ringer's Injection, Dextrose Injection, Dextrose and Sodium Chloride 
Injection, Lactated Ringer's Injection, or other vehicle as known in the art. The 
pharmaceutical composition of the present invention may also contain stabilizers, 
preservatives, buffers, antioxidants, or other additives known to those of skill in the art. For 
injection, the agents of the invention may be formulated in aqueous solutions, preferably in 
physiologically compatible buffers such as Hanks's solution, Ringer's solution, or 
physiological saline buffer. For transmucosal administration, penetrants appropriate to the 
barrier to be permeated are used in the formulation. Such penetrants are generally known in 
the art. 

For oral administration, the compounds can be formulated readily by combining the 
active compounds with pharmaceutically acceptable carriers well known in the art. Such 
carriers enable the compounds of the invention to be formulated as tablets, pills, dragees, 
capsules, liquids, gels, syrups, slurries, suspensions and the like, for oral ingestion by a 
patient to be treated. Pharmaceutical preparations for oral use can be obtained from a solid 
excipient, optionally grinding a resulting mixture, and processing the mixture of granules, 
after adding suitable auxiliaries, if desired, to obtain tablets or dragee cores. Suitable 
excipients are, in particular, fillers such as sugars, including lactose, sucrose, mannitol, or 
sorbitol; cellulose preparations such as, for example, maize starch, wheat starch, rice starch, 
potato starch, gelatin, gum tragacanth, methyl cellulose, hydroxypropylmethyl-cellulose, 
sodium carboxymethylcellulose, and/or polyvinylpyrrolidone (PVP). If desired, 
disintegrating agents may be added, such as the cross-linked polyvinyl pyrrolidone, agar, or 
alginic acid or a salt thereof such as sodium alginate. Dragee cores are provided with 
suitable coatings. For this purpose, concentrated sugar solutions may be used, which may 
optionally contain gum arabic, talc, polyvinyl pyrrolidone, carbopol gel, polyethylene glycol, 
and/or titanium dioxide, lacquer solutions, and suitable organic solvents or solvent mixtures. 
Dyestuffs or pigments may be added to the tablets or dragee coatings for identification or to 
characterize different combinations of active compound doses. 

Pharmaceutical preparations which can be used orally include push-fit capsules made 
of gelatin, as well as soft, sealed capsules made of gelatin and a plasticizer, such as glycerol 
or sorbitol. The push-fit capsules can contain the active ingredients in admixture with filler 
such as lactose, binders such as starches, and/or lubricants such as talc or magnesium 
stearate and, optionally, stabilizers. In soft capsules, the active compounds may be dissolved 
or suspended in suitable liquids, such as fatty oils, liquid paraffin, or liquid polyethylene 
glycols. In addition, stabilizers may be added. All formulations for oral administration 
should be in dosages suitable for such administration. For buccal administration, the 
compositions may take the form of tablets or lozenges formulated in conventional manner. 

For administration by inhalation, the compounds for use according to the present 
invention are conveniently delivered in the form of an aerosol spray presentation from 
pressurized packs or a nebuliser, with the use of a suitable propellant, e.g., 
dichlorodifluoromethane, trichlorofluoromethane, dichlorotetrafluoroethane, carbon dioxide 
or other suitable gas. In the case of a pressurized aerosol the dosage unit may be determined 
by providing a valve to deliver a metered amount. Capsules and cartridges of, e.g., gelatin 
for use in an inhaler or insufflator may be formulated containing a powder mix of the 
compound and a suitable powder base such as lactose or starch. The compounds may be 
formulated for parenteral administration by injection, e.g., by bolus injection or continuous 
infusion. Formulations for injection may be presented in unit dosage form, e.g., in ampules 
or in multi-dose containers, with an added preservative. The compositions may take such 
forms as suspensions, solutions or emulsions in oily or aqueous vehicles, and may contain 
formulatory agents such as suspending, stabilizing and/or dispersing agents. 

Pharmaceutical formulations for parenteral administration include aqueous solutions 
of the active compounds in water-soluble form. Additionally, suspensions of the active 
compounds may be prepared as appropriate oily injection suspensions. Suitable lipophilic 
solvents or vehicles include fatty oils such as sesame oil, or synthetic fatty acid esters, such 
as ethyl oleate or triglycerides, or liposomes. Aqueous injection suspensions may contain 
substances which increase the viscosity of the suspension, such as sodium carboxymethyl 
cellulose, sorbitol, or dextran. Optionally, the suspension may also contain suitable 
stabilizers or agents which increase the solubility of the compounds to allow for the 
preparation of highly concentrated solutions. Alternatively, the active ingredient may be in 
powder form for constitution with a suitable vehicle, e.g., sterile pyrogen-free water, before 
use. 

The compounds may also be formulated in rectal compositions such as suppositories 
or retention enemas, e.g., containing conventional suppository bases such as cocoa butter or 
other glycerides. In addition to the formulations described previously, the compounds may 
also be formulated as a depot preparation. Such long acting formulations may be 
administered by implantation (for example subcutaneously or intramuscularly) or by 
intramuscular injection. Thus, for example, the compounds may be formulated with suitable 
polymeric or hydrophobic materials (for example as an emulsion in an acceptable oil) or ion 
exchange resins, or as sparingly soluble derivatives, for example, as a sparingly soluble salt. 

A pharmaceutical carrier for the hydrophobic compounds of the invention is a 
co-solvent system comprising benzyl alcohol, a nonpolar surfactant, a water-miscible organic 
polymer, and an aqueous phase. The co-solvent system may be the VPD co-solvent system. 
VPD is a solution of 3% w/v benzyl alcohol, 8% w/v of the nonpolar surfactant polysorbate 
80, and 65% w/v polyethylene glycol 300, made up to volume in absolute ethanol. The VPD 
co-solvent system (VPD:5W) consists of VPD diluted 1:1 with a 5% dextrose in water 
solution. This co-solvent system dissolves hydrophobic compounds well, and itself produces 
low toxicity upon systemic administration. Naturally, the proportions of a co-solvent system 
may be varied considerably without destroying its solubility and toxicity characteristics. 
Furthermore, the identity of the co-solvent components may be varied: for example, other 
low-toxicity nonpolar surfactants may be used instead of polysorbate 80; the fraction size of 
polyethylene glycol may be varied; other biocompatible polymers may replace polyethylene 
glycol, e.g. polyvinyl pyrrolidone; and other sugars or polysaccharides may substitute for 
dextrose. Alternatively, other delivery systems for hydrophobic pharmaceutical compounds 
may be employed. Liposomes and emulsions are well known examples of delivery vehicles 
or carriers for hydrophobic drugs. Certain organic solvents such as dimethylsulfoxide also 
may be employed, although usually at the cost of greater toxicity. Additionally, the 
compounds may be delivered using a sustained-release system, such as semipermeable 
matrices of solid hydrophobic polymers containing the therapeutic agent. Various types of 
sustained-release materials have been established and are well known by those skilled in the 
art. Sustained-release capsules may, depending on their chemical nature, release the 
compounds for a few weeks up to over 100 days. Depending on the chemical nature and the 
biological stability of the therapeutic reagent, additional strategies for protein or other active 
ingredient stabilization may be employed. 
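
The composition of the VPD:5W co-solvent system described above follows from simple dilution arithmetic, sketched here in Python. The sketch assumes that volumes are additive on mixing (any volume contraction between the ethanolic and aqueous phases is ignored); the dictionary and function names are hypothetical.

```python
# Stated composition of VPD, in % w/v (balance is absolute ethanol).
VPD_PERCENT_WV = {
    "benzyl alcohol": 3.0,
    "polysorbate 80": 8.0,
    "polyethylene glycol 300": 65.0,
}

# Dextrose content of the aqueous diluent (5% dextrose in water).
DEXTROSE_SOLUTION_PERCENT_WV = 5.0


def diluted_composition(vpd_parts=1.0, dextrose_solution_parts=1.0):
    """Final % w/v of each solute after mixing VPD with 5% dextrose in water.

    Assumes additive volumes, so each concentration scales by its volume fraction.
    """
    total = vpd_parts + dextrose_solution_parts
    result = {
        name: percent * vpd_parts / total
        for name, percent in VPD_PERCENT_WV.items()
    }
    result["dextrose"] = DEXTROSE_SOLUTION_PERCENT_WV * dextrose_solution_parts / total
    return result


if __name__ == "__main__":
    for name, percent in diluted_composition().items():
        print(f"{name}: {percent:g}% w/v")
```

With the stated 1:1 dilution, this gives 1.5% w/v benzyl alcohol, 4% w/v polysorbate 80, 32.5% w/v polyethylene glycol 300 and 2.5% w/v dextrose. Changing the two arguments shows how the proportions shift when the co-solvent ratio is varied.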

The pharmaceutical compositions also may comprise suitable solid or gel phase 
carriers or excipients. Examples of such carriers or excipients include but are not limited to 
calcium carbonate, calcium phosphate, various sugars, starches, cellulose derivatives, 
gelatin, and polymers such as polyethylene glycols. Many of the active ingredients of the 
invention may be provided as salts with pharmaceutically compatible counter ions. Such 
pharmaceutically acceptable base addition salts are those salts which retain the biological 
effectiveness and properties of the free acids and which are obtained by reaction with 
inorganic or organic bases such as sodium hydroxide, magnesium hydroxide, ammonia, 
trialkylamine, dialkylamine, monoalkylamine, dibasic amino acids, sodium acetate, 
potassium benzoate, triethanol amine and the like. 

The pharmaceutical composition of the invention may be in the form of a complex of 
the protein(s) or other active ingredient(s) of the present invention along with protein or peptide 
antigens. The protein and/or peptide antigen will deliver a stimulatory signal to both B and T 
lymphocytes. B lymphocytes will respond to antigen through their surface immunoglobulin 
receptor. T lymphocytes will respond to antigen through the T cell receptor (TCR) 
following presentation of the antigen by MHC proteins. MHC and structurally related 
proteins including those encoded by class I and class II MHC genes on host cells will serve 
to present the peptide antigen(s) to T lymphocytes. The antigen components could also be 
supplied as purified MHC-peptide complexes alone or with co-stimulatory molecules that 
can directly signal T cells. Alternatively, antibodies able to bind surface immunoglobulin 
and other molecules on B cells as well as antibodies able to bind the TCR and other 
molecules on T cells can be combined with the pharmaceutical composition of the invention. 
The pharmaceutical composition of the invention may be in the form of a liposome in 
which protein of the present invention is combined, in addition to other pharmaceutically 
acceptable carriers, with amphipathic agents such as lipids which exist in aggregated form as 
micelles, insoluble monolayers, liquid crystals, or lamellar layers in aqueous solution. 
Suitable lipids for liposomal formulation include, without limitation, monoglycerides, 
diglycerides, sulfatides, lysolecithins, phospholipids, saponin, bile acids, and the like. 
Preparation of such liposomal formulations is within the level of skill in the art, as disclosed, 
for example, in U.S. Patent Nos. 4,235,871; 4,501,728; 4,837,028; and 4,737,323, all of 
which are incorporated herein by reference. 
The amount of protein or other active ingredient of the present invention in the 
pharmaceutical composition of the present invention will depend upon the nature and 
severity of the condition being treated, and on the nature of prior treatments which the 
patient has undergone. Ultimately, the attending physician will decide the amount of protein 
or other active ingredient of the present invention with which to treat each individual patient. 
Initially, the attending physician will administer low doses of protein or other active 
ingredient of the present invention and observe the patient's response. Larger doses of 
protein or other active ingredient of the present invention may be administered until the 
optimal therapeutic effect is obtained for the patient, and at that point the dosage is not 
increased further. It is contemplated that the various pharmaceutical compositions used to 
practice the method of the present invention should contain about 0.01 µg to about 100 mg 
(preferably about 0.1 µg to about 10 mg, more preferably about 0.1 µg to about 1 mg) of 
protein or other active ingredient of the present invention per kg body weight. For 
compositions of the present invention which are useful for bone, cartilage, tendon or 
ligament regeneration, the therapeutic method includes administering the composition 
topically, systemically, or locally as an implant or device. When administered, the 
therapeutic composition for use in this invention is, of course, in a pyrogen-free, 
physiologically acceptable form. Further, the composition may desirably be encapsulated or 
injected in a viscous form for delivery to the site of bone, cartilage or tissue damage. 
Topical administration may be suitable for wound healing and tissue repair. Therapeutically 
useful agents other than a protein or other active ingredient of the invention which may also 
optionally be included in the composition as described above, may alternatively or 
additionally, be administered simultaneously or sequentially with the composition in the 
methods of the invention. Preferably for bone and/or cartilage formation, the composition 
would include a matrix capable of delivering the protein-containing or other active 
ingredient-containing composition to the site of bone and/or cartilage damage, providing a 
structure for the developing bone and cartilage and optimally capable of being resorbed into 
the body. Such matrices may be formed of materials presently in use for other implanted 
medical applications. 
The choice of matrix material is based on biocompatibility, biodegradability, 
mechanical properties, cosmetic appearance and interface properties. The particular 
application of the compositions will define the appropriate formulation. Potential matrices 
for the compositions may be biodegradable and chemically defined calcium sulfate, 
tricalcium phosphate, hydroxyapatite, polylactic acid, polyglycolic acid and polyanhydrides. 
Other potential materials are biodegradable and biologically well-defined, such as bone or 
dermal collagen. Further matrices are comprised of pure proteins or extracellular matrix 
components. Other potential matrices are nonbiodegradable and chemically defined, such as 
sintered hydroxyapatite, bioglass, aluminates, or other ceramics. Matrices may be comprised
of combinations of any of the above-mentioned types of material, such as polylactic acid and
hydroxyapatite or collagen and tricalcium phosphate. The bioceramics may be altered in 
composition, such as in calcium-aluminate-phosphate and processing to alter pore size, 
particle size, particle shape, and biodegradability. Presently preferred is a 50:50 (mole 
weight) copolymer of lactic acid and glycolic acid in the form of porous particles having
diameters ranging from 150 to 800 microns. In some applications, it will be useful to utilize
a sequestering agent, such as carboxymethyl cellulose or autologous blood clot, to prevent 
the protein compositions from disassociating from the matrix. 

A preferred family of sequestering agents is cellulosic materials such as
alkylcelluloses (including hydroxyalkylcelluloses), including methylcellulose,
ethylcellulose, hydroxyethylcellulose, hydroxypropylcellulose,
hydroxypropyl-methylcellulose, and carboxymethylcellulose, the most preferred being
cationic salts of carboxymethylcellulose (CMC). Other preferred sequestering agents
include hyaluronic acid, sodium alginate, poly(ethylene glycol), polyoxyethylene oxide,
carboxyvinyl polymer and poly(vinyl alcohol). The amount of sequestering agent useful
herein is 0.5-20 wt %, preferably 1-10 wt % based on total formulation weight, which
represents the amount necessary to prevent desorption of the protein from the polymer 
matrix and to provide appropriate handling of the composition, yet not so much that the 
progenitor cells are prevented from infiltrating the matrix, thereby providing the protein the 
opportunity to assist the osteogenic activity of the progenitor cells. In further compositions,
proteins or other active ingredients of the invention may be combined with other agents
beneficial to the treatment of the bone and/or cartilage defect, wound, or tissue in question. 
These agents include various growth factors such as epidermal growth factor (EGF),
platelet-derived growth factor (PDGF), transforming growth factors (TGF-α and TGF-β), and
insulin-like growth factor (IGF). 

The therapeutic compositions are also presently valuable for veterinary applications.
In particular, domestic animals and thoroughbred horses, in addition to humans, are desired
patients for such treatment with proteins or other active ingredients of the present invention.
The dosage regimen of a protein-containing pharmaceutical composition to be used in tissue 
regeneration will be determined by the attending physician considering various factors which 
modify the action of the proteins, e.g., amount of tissue weight desired to be formed, the site 
of damage, the condition of the damaged tissue, the size of a wound, type of damaged tissue
(e.g., bone), the patient's age, sex, and diet, the severity of any infection, time of
administration and other clinical factors. The dosage may vary with the type of matrix used
in the reconstitution and with inclusion of other proteins in the pharmaceutical composition.
For example, the addition of other known growth factors, such as IGF I (insulin-like growth
factor I), to the final composition, may also affect the dosage. Progress can be monitored by
periodic assessment of tissue/bone growth and/or repair, for example, X-rays,
histomorphometric determinations and tetracycline labeling. 

Polynucleotides of the present invention can also be used for gene therapy. Such 
polynucleotides can be introduced either in vivo or ex vivo into cells for expression in a 
mammalian subject. Polynucleotides of the invention may also be administered by other
known methods for introduction of nucleic acid into a cell or organism (including, without
limitation, in the form of viral vectors or naked DNA). Cells may also be cultured ex vivo in 
the presence of proteins of the present invention in order to proliferate or to produce a 
desired effect on or activity in such cells. Treated cells can then be introduced in vivo for 
therapeutic purposes. 


4.12.3 EFFECTIVE DOSAGE 

Pharmaceutical compositions suitable for use in the present invention include
compositions wherein the active ingredients are contained in an amount effective to achieve
their intended purpose. More specifically, a therapeutically effective amount means an amount
effective to prevent development of or to alleviate the existing symptoms of the subject
being treated. Determination of the effective amount is well within the capability of those 
skilled in the art, especially in light of the detailed disclosure provided herein. For any 
compound used in the method of the invention, the therapeutically effective dose can be
estimated initially from appropriate in vitro assays. For example, a dose can be formulated in
animal models to achieve a circulating concentration range that includes the IC50 as
determined in cell culture (i.e., the concentration of the test compound which achieves a
half-maximal inhibition of the protein's biological activity). Such information can be used
to more accurately determine useful doses in humans.
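
By way of illustration only, an IC50 of the kind referred to above might be estimated from cell culture data by interpolating between the two tested concentrations that bracket 50% inhibition, working on a logarithmic concentration scale. The following Python sketch is not part of the disclosed methods; the function name estimate_ic50 and the sample values are hypothetical, and a fitted dose-response model may well be preferred in practice.

```python
import math


def estimate_ic50(concentrations, percent_inhibition):
    """Interpolate the concentration giving 50% inhibition (log10 scale).

    Assumes concentrations are positive and sorted ascending, and that
    inhibition rises with concentration (hypothetical assay data).
    """
    points = list(zip(concentrations, percent_inhibition))
    for (c_lo, i_lo), (c_hi, i_hi) in zip(points, points[1:]):
        # Find the pair of adjacent points that brackets 50% inhibition.
        if i_lo <= 50.0 <= i_hi and i_hi != i_lo:
            frac = (50.0 - i_lo) / (i_hi - i_lo)
            log_lo, log_hi = math.log10(c_lo), math.log10(c_hi)
            return 10 ** (log_lo + frac * (log_hi - log_lo))
    raise ValueError("50% inhibition is not bracketed by the data")


if __name__ == "__main__":
    # Hypothetical concentrations (nM) and measured inhibition (%).
    print(estimate_ic50([1.0, 100.0], [0.0, 100.0]))
```

The interpolation is deliberately simple; it only indicates where the half-maximal point lies between two measurements and says nothing about the shape of the curve elsewhere.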

A therapeutically effective dose refers to that amount of the compound that results in 
amelioration of symptoms or a prolongation of survival in a patient. Toxicity and therapeutic
efficacy of such compounds can be determined by standard pharmaceutical procedures in
cell cultures or experimental animals, e.g., for determining the LD50 (the dose lethal to 50% 
of the population) and the ED50 (the dose therapeutically effective in 50% of the population). 
The dose ratio between toxic and therapeutic effects is the therapeutic index and it can be 
expressed as the ratio between LD50 and ED50. Compounds which exhibit high therapeutic
indices are preferred. The data obtained from these cell culture assays and animal studies
can be used in formulating a range of dosage for use in humans. The dosage of such
compounds lies preferably within a range of circulating concentrations that include the ED50
with little or no toxicity. The dosage may vary within this range depending upon the dosage
form employed and the route of administration utilized. The exact formulation, route of
administration and dosage can be chosen by the individual physician in view of the patient's
condition. See, e.g., Fingl et al., 1975, in "The Pharmacological Basis of Therapeutics", Ch.
1, p. 1. Dosage amount and interval may be adjusted individually to provide plasma levels of
the active moiety which are sufficient to maintain the desired effects, or minimal effective
concentration (MEC). The MEC will vary for each compound but can be estimated from in
vitro data. Dosages necessary to achieve the MEC will depend on individual characteristics
and route of administration. However, HPLC assays or bioassays can be used to determine 
plasma concentrations. 
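
The therapeutic index described above is a simple ratio, and a short Python sketch may make the arithmetic concrete. The function name therapeutic_index and the numerical values are hypothetical and are given for illustration only; LD50 and ED50 must be expressed in the same units.

```python
def therapeutic_index(ld50, ed50):
    """Return LD50 / ED50; both doses must share the same units."""
    if ld50 <= 0 or ed50 <= 0:
        raise ValueError("LD50 and ED50 must be positive")
    return ld50 / ed50


if __name__ == "__main__":
    # Hypothetical compound: LD50 of 500 mg/kg, ED50 of 5 mg/kg.
    print(therapeutic_index(500.0, 5.0))
```

A larger ratio corresponds to a wider margin between the effective and the toxic dose, which is why compounds with high therapeutic indices are preferred.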

Dosage intervals can also be determined using the MEC value. Compounds should be
administered using a regimen which maintains plasma levels above the MEC for 10-90% of
the time, preferably between 30-90% and most preferably between 50-90%. In cases of local
administration or selective uptake, the effective local concentration of the drug may not be 
related to plasma concentration. 
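
As a rough illustration of how the fraction of time above the MEC might be checked for a candidate regimen, the Python sketch below assumes a deliberately simplified model that is not taken from this disclosure: a single bolus per dosing interval, first-order elimination, and no carry-over between doses. The function name and all parameter values are hypothetical.

```python
import math


def fraction_of_interval_above_mec(peak_conc, mec, half_life_h, interval_h):
    """Fraction of one dosing interval during which C(t) exceeds the MEC.

    Assumes C(t) = peak_conc * exp(-k * t) with k = ln(2) / half_life_h,
    i.e. one bolus per interval and no accumulation (a simplification).
    """
    if min(peak_conc, mec, half_life_h, interval_h) <= 0:
        raise ValueError("all inputs must be positive")
    if peak_conc <= mec:
        return 0.0
    k = math.log(2) / half_life_h
    # Time at which the concentration has decayed to the MEC.
    time_above = math.log(peak_conc / mec) / k
    return min(time_above / interval_h, 1.0)


if __name__ == "__main__":
    # Hypothetical: peak 8 ug/mL, MEC 1 ug/mL, 4 h half-life, 24 h interval.
    print(fraction_of_interval_above_mec(8.0, 1.0, 4.0, 24.0))
```

Under these assumptions, a result between 0.5 and 0.9 would correspond to the most preferred range stated above; measured plasma concentrations would of course take precedence over any such estimate.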

An exemplary dosage regimen for polypeptides or other compositions of the
invention will be in the range of about 0.01 µg/kg to 100 mg/kg of body weight daily, with
the preferred dose being about 0.1 µg/kg to 25 mg/kg of patient body weight daily, varying
in adults and children. Dosing may be once daily, or equivalent doses may be delivered at
longer or shorter intervals.

The amount of composition administered will, of course, be dependent on the subject 
being treated, on the subject's age and weight, the severity of the affliction, the manner of 
administration and the judgment of the prescribing physician. 

4.12.4 PACKAGING

The compositions may, if desired, be presented in a pack or dispenser device which 
may contain one or more unit dosage forms containing the active ingredient. The pack may, 
for example, comprise metal or plastic foil, such as a blister pack. The pack or dispenser 
device may be accompanied by instructions for administration. Compositions comprising a
compound of the invention formulated in a compatible pharmaceutical carrier may also be
prepared, placed in an appropriate container, and labeled for treatment of an indicated 
condition. 

4.13 ANTIBODIES 

Also included in the invention are antibodies to proteins, or fragments of proteins of
the invention. The term "antibody" as used herein refers to immunoglobulin molecules and
immunologically active portions of immunoglobulin (Ig) molecules, i.e., molecules that
contain an antigen-binding site that specifically binds (immunoreacts with) an antigen. Such
antibodies include, but are not limited to, polyclonal, monoclonal, chimeric, single chain,
Fab, Fab' and F(ab')2 fragments, and an Fab expression library. In general, an antibody molecule
obtained from humans relates to any of the classes IgG, IgM, IgA, IgE and IgD, which differ
from one another by the nature of the heavy chain present in the molecule. Certain classes
have subclasses as well, such as IgG1, IgG2, and others. Furthermore, in humans, the light
chain may be a kappa chain or a lambda chain. Reference herein to antibodies includes a
reference to all such classes, subclasses and types of human antibody species.

An isolated related protein of the invention, or a portion or fragment thereof, may be
intended to serve as an antigen, and additionally can be used as an immunogen to generate
antibodies that immunospecifically bind the antigen, using standard techniques for
polyclonal and monoclonal antibody preparation. The full-length protein can be used or,
alternatively, the invention provides antigenic peptide fragments of the antigen for use as
immunogens. An antigenic peptide fragment comprises at least 6 amino acid residues of the
amino acid sequence of the full length protein, such as an amino acid sequence shown in
SEQ ID NO: 277-552, or 773-992, or Tables 3, 4A, 4B, 5, 6, or 8, and encompasses an
epitope thereof such that an antibody raised against the peptide forms a specific immune
complex with the full length protein or with any fragment that contains the epitope.
Preferably, the antigenic peptide comprises at least 10 amino acid residues, or at least 15
amino acid residues, or at least 20 amino acid residues, or at least 30 amino acid residues.
Preferred epitopes encompassed by the antigenic peptide are regions of the protein that are
located on its surface; commonly these are hydrophilic regions. 

In certain embodiments of the invention, at least one epitope encompassed by the 
antigenic peptide is a surface region of the protein, e.g., a hydrophilic region. A 
hydrophobicity analysis of the human related protein sequence will indicate which regions of
a related protein are particularly hydrophilic and, therefore, are likely to encode surface
residues useful for targeting antibody production. As a means for targeting antibody
production, hydropathy plots showing regions of hydrophilicity and hydrophobicity may be
generated by any method well known in the art, including, for example, the Kyte Doolittle or
the Hopp Woods methods, either with or without Fourier transformation. See, e.g., Hopp and
Woods, 1981, Proc. Nat. Acad. Sci. USA 78: 3824-3828; Kyte and Doolittle 1982, J. Mol.
Biol. 157: 105-142, each of which is incorporated herein by reference in its entirety. 
Antibodies that are specific for one or more domains within an antigenic protein, or 
derivatives, fragments, analogs or homologs thereof, are also provided herein. 
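
A hydropathy profile of the kind mentioned above can be sketched in a few lines of Python. The example below uses the published Kyte Doolittle hydropathy values with a plain sliding-window average and no Fourier transformation; the function names, the window length and the test sequence are hypothetical choices made only for illustration, and the window with the lowest average is merely a candidate surface region, not a confirmed epitope.

```python
# Kyte-Doolittle hydropathy values (positive = hydrophobic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence, window=7):
    """Mean hydropathy of every window of the given length."""
    sequence = sequence.upper()
    if window < 1 or window > len(sequence):
        raise ValueError("window must be between 1 and the sequence length")
    scores = [KYTE_DOOLITTLE[aa] for aa in sequence]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(sequence) - window + 1)
    ]


def most_hydrophilic_window(sequence, window=7):
    """Return (start index, peptide) of the lowest-scoring window."""
    profile = hydropathy_profile(sequence, window)
    start = min(range(len(profile)), key=profile.__getitem__)
    return start, sequence.upper()[start:start + window]


if __name__ == "__main__":
    # Hypothetical sequence; a window of 6 matches the minimum fragment length.
    print(most_hydrophilic_window("MIVLLAGKDEKRSDLIVA", window=6))
```

Plotting the values returned by hydropathy_profile against window position gives the hydropathy plot itself; the Hopp Woods method would follow the same pattern with a different table of values.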

A protein of the invention, or a derivative, fragment, analog, homolog or ortholog
thereof, may be utilized as an immunogen in the generation of antibodies that
immunospecifically bind these protein components.

The term "specific for" indicates that the variable regions of the antibodies of the 
invention recognize and bind polypeptides of the invention exclusively (i.e., able to 
distinguish the polypeptide of the invention from other similar polypeptides despite sequence
identity, homology, or similarity found in the family of polypeptides), but may also interact
with other proteins (for example, S. aureus protein A or other antibodies in ELISA
techniques) through interactions with sequences outside the variable region of the antibodies,
and in particular, in the constant region of the molecule. Screening assays to determine
binding specificity of an antibody of the invention are well known and routinely practiced in
the art. For a comprehensive discussion of such assays, see Harlow et al. (Eds), Antibodies:
A Laboratory Manual; Cold Spring Harbor Laboratory; Cold Spring Harbor, NY (1988),
Chapter 6. Antibodies that recognize and bind fragments of the polypeptides of the
invention are also contemplated, provided that the antibodies are first and foremost specific
for, as defined above, full-length polypeptides of the invention. As with antibodies that are
specific for full length polypeptides of the invention, antibodies of the invention that
recognize fragments are those which can distinguish polypeptides from the same family of
polypeptides despite inherent sequence identity, homology, or similarity found in the family
of proteins.

Antibodies of the invention are useful for, for example, therapeutic purposes (by 
modulating activity of a polypeptide of the invention), diagnostic purposes to detect or 
quantitate a polypeptide of the invention, as well as purification of a polypeptide of the 
invention. Kits comprising an antibody of the invention for any of the purposes described
herein are also comprehended. In general, a kit of the invention also includes a control
antigen for which the antibody is immunospecific. The invention further provides a
hybridoma that produces an antibody according to the invention. Antibodies of the
invention are useful for detection and/or purification of the polypeptides of the invention.
Monoclonal antibodies binding to the protein of the invention may be useful
diagnostic agents for the immunodetection of the protein. Neutralizing monoclonal
antibodies binding to the protein may also be useful therapeutics for both conditions 
associated with the protein and also in the treatment of some forms of cancer where 
abnormal expression of the protein is involved. In the case of cancerous cells or leukemic 
cells, neutralizing monoclonal antibodies against the protein may be useful in detecting and
preventing the metastatic spread of the cancerous cells, which may be mediated by the
protein. 

The labeled antibodies of the present invention can be used for in vitro, in vivo, and
in situ assays to identify cells or tissues in which a fragment of the polypeptide of interest is
expressed. The antibodies may also be used directly in therapies or other diagnostics. The
present invention further provides the above-described antibodies immobilized on a solid
support. Examples of such solid supports include plastics such as polycarbonate, complex
carbohydrates such as agarose and Sepharose®, acrylic resins such as polyacrylamide
and latex beads. Techniques for coupling antibodies to such solid supports are well known
Scientific Publications, Oxford, England, Chapter 10 (1986); Jacoby, W.D. et al., Meth.
Enzym. 34 Academic Press, N.Y. (1974)). The immobilized antibodies of the present
invention can be used for in vitro, in vivo, and in situ assays as well as for immuno-affinity
purification of the proteins of the present invention.

Various procedures known within the art may be used for the production of 
polyclonal or monoclonal antibodies directed against a protein of the invention, or against 
derivatives, fragments, analogs, homologs or orthologs thereof (see, for example, Antibodies:
A Laboratory Manual, Harlow E, and Lane D, 1988, Cold Spring Harbor Laboratory Press,
Cold Spring Harbor, NY, incorporated herein by reference). Some of these antibodies are
discussed below. 

4.13.1 POLYCLONAL ANTIBODIES 

For the production of polyclonal antibodies, various suitable host animals (e.g.,
rabbit, goat, mouse or other mammal) may be immunized by one or more injections with the
native protein, a synthetic variant thereof, or a derivative of the foregoing. An appropriate
immunogenic preparation can contain, for example, the naturally occurring immunogenic
protein, a chemically synthesized polypeptide representing the immunogenic protein, or a
recombinantly expressed immunogenic protein. Furthermore, the protein may be conjugated
to a second protein known to be immunogenic in the mammal being immunized. Examples
of such immunogenic proteins include but are not limited to keyhole limpet hemocyanin, 
serum albumin, bovine thyroglobulin, and soybean trypsin inhibitor. The preparation can 
further include an adjuvant. Various adjuvants used to increase the immunological response 
include, but are not limited to, Freund's (complete and incomplete), mineral gels (e.g.,
aluminum hydroxide), surface-active substances (e.g., lysolecithin, pluronic polyols,
polyanions, peptides, oil emulsions, dinitrophenol, etc.), adjuvants usable in humans such as
Bacille Calmette-Guerin and Corynebacterium parvum, or similar immunostimulatory
agents. Additional examples of adjuvants that can be employed include MPL-TDM adjuvant
(monophosphoryl Lipid A, synthetic trehalose dicorynomycolate).

The polyclonal antibody molecules directed against the immunogenic protein can be
isolated from the mammal (e.g., from the blood) and further purified by well known
techniques, such as affinity chromatography using protein A or protein G, which provide
primarily the IgG fraction of immune serum. Subsequently, or alternatively, the specific
antigen which is the target of the immunoglobulin sought, or an epitope thereof, may be
immobilized on a column to purify the immune specific antibody by immunoaffinity
chromatography. Purification of immunoglobulins is discussed, for example, by D.
Wilkinson (The Scientist, published by The Scientist, Inc., Philadelphia PA, Vol. 14, No. 8
(April 17, 2000), pp. 25-28).

4.13.2 MONOCLONAL ANTIBODIES 

The term "monoclonal antibody" (MAb) or "monoclonal antibody composition", as 
used herein, refers to a population of antibody molecules that contain only one molecular
species of antibody molecule consisting of a unique light chain gene product and a unique
heavy chain gene product. In particular, the complementarity determining regions (CDRs) 
of the monoclonal antibody are identical in all the molecules of the population. MAbs thus 
contain an antigen-binding site capable of immunoreacting with a particular epitope of the 
antigen characterized by a unique binding affinity for it. 

Monoclonal antibodies can be prepared using hybridoma methods, such as those
described by Kohler and Milstein, Nature, 256, 495 (1975). In a hybridoma method, a
mouse, hamster, or other appropriate host animal, is typically immunized with an
immunizing agent to elicit lymphocytes that produce or are capable of producing antibodies
that will specifically bind to the immunizing agent. Alternatively, the lymphocytes can be
immunized in vitro.

The immunizing agent will typically include the protein antigen, a fragment thereof 
or a fusion protein thereof. Generally, either peripheral blood lymphocytes are used if cells 
of human origin are desired, or spleen cells or lymph node cells are used if non-human 
mammalian sources are desired. The lymphocytes are then fused with an immortalized cell
line using a suitable fusing agent, such as polyethylene glycol, to form a hybridoma cell
(Goding, Monoclonal Antibodies: Principles and Practice, Academic Press, (1986) pp. 59-103).
Immortalized cell lines are usually transformed mammalian cells, particularly
myeloma cells of rodent, bovine and human origin. Usually, rat or mouse myeloma cell
lines are employed. The hybridoma cells can be cultured in a suitable culture medium that
preferably contains one or more substances that inhibit the growth or survival of the unfused,
immortalized cells. For example, if the parental cells lack the enzyme hypoxanthine guanine
phosphoribosyl transferase (HGPRT or HPRT), the culture medium for the hybridomas
typically will include hypoxanthine, aminopterin, and thymidine ("HAT medium"), which
substances prevent the growth of HGPRT-deficient cells. 

Preferred immortalized cell lines are those that fuse efficiently, support stable high 
level expression of antibody by the selected antibody-producing cells, and are sensitive to a 
medium such as HAT medium. More preferred immortalized cell lines are murine myeloma
lines, which can be obtained, for instance, from the Salk Institute Cell Distribution Center, 
San Diego, California and the American Type Culture Collection, Manassas, Virginia. 
Human myeloma and mouse-human heteromyeloma cell lines also have been described for 
the production of human monoclonal antibodies (Kozbor, J. Immunol., 133:3001 (1984);
Brodeur et al., Monoclonal Antibody Production Techniques and Applications, Marcel
Dekker, Inc., New York, (1987) pp. 51-63). 

The culture medium in which the hybridoma cells are cultured can then be assayed 
for the presence of monoclonal antibodies directed against the antigen. Preferably, the 
binding specificity of monoclonal antibodies produced by the hybridoma cells is determined
by immunoprecipitation or by an in vitro binding assay, such as radioimmunoassay (RIA) or
enzyme-linked immunosorbent assay (ELISA). Such techniques and assays are known in
the art. The binding affinity of the monoclonal antibody can, for example, be determined by
the Scatchard analysis of Munson and Pollard, Anal. Biochem., 107, 220 (1980). Preferably,
antibodies having a high degree of specificity and a high binding affinity for the target
antigen are isolated.
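
To indicate what a Scatchard-type estimate of binding affinity involves, the Python sketch below fits a straight line to bound/free plotted against bound, where the slope is -1/Kd and the x-intercept is Bmax for a single class of binding sites. It is a plain unweighted least-squares fit written for illustration, not the curve-fitting procedure of the cited reference, and the function name and data are hypothetical.

```python
def scatchard_fit(bound, free):
    """Estimate (Kd, Bmax) from a linear Scatchard plot.

    Fits bound/free = (Bmax - bound) / Kd by ordinary least squares,
    assuming a single class of non-interacting binding sites.
    """
    if len(bound) != len(free) or len(bound) < 2:
        raise ValueError("need at least two paired measurements")
    ratios = [b / f for b, f in zip(bound, free)]
    n = len(bound)
    mean_x = sum(bound) / n
    mean_y = sum(ratios) / n
    sxx = sum((x - mean_x) ** 2 for x in bound)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(bound, ratios))
    slope = sxy / sxx
    if slope >= 0:
        raise ValueError("slope must be negative for a Scatchard plot")
    intercept = mean_y - slope * mean_x
    kd = -1.0 / slope          # dissociation constant
    bmax = intercept * kd      # total binding-site concentration
    return kd, bmax


if __name__ == "__main__":
    # Hypothetical data generated with Kd = 2 nM and Bmax = 10 nM.
    bound = [1.0, 2.0, 4.0, 8.0]
    free = [b * 2.0 / (10.0 - b) for b in bound]
    print(scatchard_fit(bound, free))
```

A smaller Kd indicates higher binding affinity, so antibodies of the kind preferred above would show a steeper negative slope on such a plot.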

After the desired hybridoma cells are identified, the clones can be subcloned by 
limiting dilution procedures and grown by standard methods. Suitable culture media for this 
purpose include, for example, Dulbecco's Modified Eagle's Medium and RPMI-1640
medium. Alternatively, the hybridoma cells can be grown in vivo as ascites in a mammal.

The monoclonal antibodies secreted by the subclones can be isolated or purified from
the culture medium or ascites fluid by conventional immunoglobulin purification procedures
such as, for example, protein A-Sepharose, hydroxylapatite chromatography, gel
electrophoresis, dialysis, or affinity chromatography.

The monoclonal antibodies can also be made by recombinant DNA methods, such as
those described in U.S. Patent No. 4,816,567. DNA encoding the monoclonal antibodies of
the invention can be readily isolated and sequenced using conventional procedures (e.g., by 
using oligonucleotide probes that are capable of binding specifically to genes encoding the 
heavy and light chains of murine antibodies). The hybridoma cells of the invention serve as
a preferred source of such DNA. Once isolated, the DNA can be placed into expression
vectors, which are then transfected into host cells such as simian COS cells, Chinese hamster
ovary (CHO) cells, or myeloma cells that do not otherwise produce immunoglobulin protein,
to obtain the synthesis of monoclonal antibodies in the recombinant host cells. The DNA
also can be modified, for example, by substituting the coding sequence for human heavy and
light chain constant domains in place of the homologous murine sequences (U.S. Patent No.
4,816,567; Morrison, Nature 368, 812-13 (1994)) or by covalently joining to the
immunoglobulin coding sequence all or part of the coding sequence for a
non-immunoglobulin polypeptide. Such a non-immunoglobulin polypeptide can be substituted
for the constant domains of an antibody of the invention, or can be substituted for the
variable domains of one antigen-combining site of an antibody of the invention to create a
chimeric bivalent antibody.

4.13.3 HUMANIZED ANTIBODIES 

The antibodies directed against the protein antigens of the invention can further
comprise humanized antibodies or human antibodies. These antibodies are suitable for
administration to humans without engendering an immune response by the human against
the administered immunoglobulin. Humanized forms of antibodies are chimeric
immunoglobulins, immunoglobulin chains or fragments thereof (such as Fv, Fab, Fab',
F(ab')2 or other antigen-binding subsequences of antibodies) that are principally comprised
of the sequence of a human immunoglobulin, and contain minimal sequence derived from a
non-human immunoglobulin. Humanization can be performed following the method of
Winter and co-workers (Jones et al., Nature, 321, 522-525 (1986); Riechmann et al., Nature,
332, 323-327 (1988); Verhoeyen et al., Science, 239, 1534-1536 (1988)), by substituting
rodent CDRs or CDR sequences for the corresponding sequences of a human antibody. (See
also U.S. Patent No. 5,225,539). In some instances, Fv framework residues of the human 
immunoglobulin are replaced by corresponding non-human residues. Humanized antibodies 
can also comprise residues that are found neither in the recipient antibody nor in the 
imported CDR or framework sequences. In general, the humanized antibody will comprise
substantially all of at least one, and typically two, variable domains, in which all or
substantially all of the CDR regions correspond to those of a non-human immunoglobulin
and all or substantially all of the framework regions are those of a human immunoglobulin
consensus sequence. The humanized antibody optimally also will comprise at least a portion
of an immunoglobulin constant region (Fc), typically that of a human immunoglobulin
(Jones et al., 1986; Riechmann et al., 1988; and Presta, Curr. Op. Struct. Biol., 2, 593-596 
(1992)). 

4.13.4 HUMAN ANTIBODIES

Fully human antibodies relate to antibody molecules in which essentially the entire 
sequences of both the light chain and the heavy chain, including the CDRs, arise from 
human genes. Such antibodies are termed "human antibodies", or "fully human antibodies" 
herein. Human monoclonal antibodies can be prepared by the trioma technique; the human
B-cell hybridoma technique (see Kozbor, et al., 1983 Immunol Today 4: 72) and the EBV
hybridoma technique to produce human monoclonal antibodies (see Cole, et al., 1985 In:
Monoclonal Antibodies and Cancer Therapy, Alan R. Liss, Inc., pp. 77-96). Human
monoclonal antibodies may be utilized in the practice of the present invention and may be
produced by using human hybridomas (see Cote, et al., 1983. Proc Natl Acad Sci USA 80,
2026-2030) or by transforming human B-cells with Epstein Barr Virus in vitro (see Cole, et
al., 1985 In: Monoclonal Antibodies and Cancer Therapy, Alan R. Liss, Inc., pp. 77-96). 

In addition, human antibodies can also be produced using additional techniques, 
including phage display libraries (Hoogenboom and Winter, J. Mol. Biol., 227, 381 (1991);
Marks et al., J. Mol. Biol., 222:581 (1991)). Similarly, human antibodies can be made by
introducing human immunoglobulin loci into transgenic animals, e.g., mice in which the
endogenous immunoglobulin genes have been partially or completely inactivated. Upon
challenge, human antibody production is observed, which closely resembles that seen in
humans in all respects, including gene rearrangement, assembly, and antibody repertoire.
This approach is described, for example, in U.S. Patent Nos. 5,545,807; 5,545,806;
5,569,825; 5,625,126; 5,633,425; 5,661,016, and in Marks et al. (Bio/Technology 10, 779-783
(1992)); Lonberg et al. (Nature 368, 856-859 (1994)); Morrison (Nature 368, 812-13
(1994)); Fishwild et al. (Nature Biotechnology 14, 845-51 (1996)); Neuberger (Nature
Biotechnology 14, 826 (1996)); and Lonberg and Huszar (Intern. Rev. Immunol. 13, 65-93
(1995)).

Human antibodies may additionally be produced using transgenic nonhuman animals
that are modified so as to produce fully human antibodies rather than the animal's
endogenous antibodies in response to challenge by an antigen. (See PCT publication
WO94/02602). The endogenous genes encoding the heavy and light immunoglobulin chains
in the nonhuman host have been incapacitated, and active loci encoding human heavy and
light chain immunoglobulins are inserted into the host's genome. The human genes are 
incorporated, for example, using yeast artificial chromosomes containing the requisite 
human DNA segments. An animal which provides all the desired modifications is then 
obtained as progeny by crossbreeding intermediate transgenic animals containing fewer than
the full complement of the modifications. The preferred embodiment of such a nonhuman 
animal is a mouse, and is termed the Xenomouse™ as disclosed in PCT publications WO 
96/33735 and WO 96/34096. This animal produces B cells that secrete fully human 
immunoglobulins. The antibodies can be obtained directly from the animal after
immunization with an immunogen of interest, as, for example, a preparation of a polyclonal
antibody, or alternatively from immortalized B cells derived from the animal, such as 
hybridomas producing monoclonal antibodies. Additionally, the genes encoding the 
immunoglobulins with human variable regions can be recovered and expressed to obtain the 
antibodies directly, or can be further modified to obtain analogs of antibodies such as, for
example, single chain Fv molecules.

An example of a method of producing a nonhuman host, exemplified as a mouse, 
lacking expression of an endogenous immunoglobulin heavy chain is disclosed in U.S. 
Patent No. 5,939,598. It can be obtained by a method including deleting the J segment genes 
from at least one endogenous heavy chain locus in an embryonic stem cell to prevent
rearrangement of the locus and to prevent formation of a transcript of a rearranged
immunoglobulin heavy chain locus, the deletion being effected by a targeting vector 
containing a gene encoding a selectable marker; and producing from the embryonic stem cell 
a transgenic mouse whose somatic and germ cells contain the gene encoding the selectable 
marker. 

25 A method for producing an antibody of interest, such as a human antibody, is 

disclosed in U.S. Patent No. 5,916,771 . It includes introducing an expression vector that 
contains a nucleotide sequence encoding a heavy chain into one mammalian host cell in 
culture, introducing an expression vector containing a nucleotide sequence encoding a light 
chain into another mammalian host cell, and fusing the two cells to form a hybrid cell. The 

hybrid cell expresses an antibody containing the heavy chain and the light chain.

In a further improvement on this procedure, a method for identifying a clinically 
relevant epitope on an immunogen, and a correlative method for selecting an antibody that 
binds immunospecifically to the relevant epitope with high affinity, are disclosed in PCT
publication WO 99/53049. 

4.13.5 FAB FRAGMENTS AND SINGLE CHAIN ANTIBODIES 

According to the invention, techniques can be adapted for the production of
single-chain antibodies specific to an antigenic protein of the invention (see e.g., U.S. Patent
No. 4,946,778). In addition, methods can be adapted for the construction of Fab expression
libraries (see e.g., Huse, et al., 1989 Science 246, 1275-1281) to allow rapid and effective
identification of monoclonal Fab fragments with the desired specificity for a protein or
derivatives, fragments, analogs or homologs thereof. Antibody fragments that contain the
idiotypes to a protein antigen may be produced by techniques known in the art including, but
not limited to: (i) an F(ab')2 fragment produced by pepsin digestion of an antibody molecule;
(ii) an Fab fragment generated by reducing the disulfide bridges of an F(ab')2 fragment; (iii) an
Fab fragment generated by the treatment of the antibody molecule with papain and a reducing
agent; and (iv) Fv fragments.

4.13.6 BISPECIFIC ANTIBODIES 

Bispecific antibodies are monoclonal, preferably human or humanized, antibodies 
that have binding specificities for at least two different antigens. In the present case, one of 
the binding specificities is for an antigenic protein of the invention. The second binding
target is any other antigen, and advantageously is a cell-surface protein or receptor or 
receptor subunit. 

Methods for making bispecific antibodies are known in the art. Traditionally, the 
recombinant production of bispecific antibodies is based on the co-expression of two 

immunoglobulin heavy-chain/light-chain pairs, where the two heavy chains have different
specificities (Milstein and Cuello, Nature, 305, 537-539 (1983)). Because of the random 
assortment of immunoglobulin heavy and light chains, these hybridomas (quadromas) 
produce a potential mixture of ten different antibody molecules, of which only one has the 
correct bispecific structure. The purification of the correct molecule is usually accomplished 

by affinity chromatography steps. Similar procedures are disclosed in WO 93/08829,
published 13 May 1993, and in Traunecker et al., 1991 EMBO J., 10, 3655-3659.
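
The figure of ten molecules follows from simple counting, which the short sketch below makes explicit. It assumes an idealized quadroma with two heavy chains and two light chains that assort at random; the labels H1, H2, L1 and L2 are illustrative names only and do not appear in the cited references.

```python
from itertools import combinations_with_replacement, product

def quadroma_species(heavy=("H1", "H2"), light=("L1", "L2")):
    """Enumerate the distinct antibody molecules formed by random chain assortment."""
    # Each arm of an antibody is one heavy chain paired with one light chain.
    arms = [h + l for h, l in product(heavy, light)]
    # A molecule is an unordered pair of arms, since its two halves are interchangeable.
    return list(combinations_with_replacement(arms, 2))

species = quadroma_species()
# The correct bispecific molecule pairs each heavy chain with its cognate light chain.
bispecific = [s for s in species if set(s) == {"H1L1", "H2L2"}]
print(len(species), bispecific)  # prints: 10 [('H1L1', 'H2L2')]
```

Four possible arms taken two at a time with repetition give ten combinations, of which exactly one carries both cognate heavy/light pairs.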

Antibody variable domains with the desired binding specificities (antibody-antigen 
combining sites) can be fused to immunoglobulin constant domain sequences. The fusion 
preferably is with an immunoglobulin heavy-chain constant domain, comprising at least part
of the hinge, CH2, and CH3 regions. It is preferred to have the first heavy-chain constant
region (CH1) containing the site necessary for light-chain binding present in at least one of
the fusions. DNAs encoding the immunoglobulin heavy-chain fusions and, if desired, the
immunoglobulin light chain, are inserted into separate expression vectors, and are
co-transfected into a suitable host organism. For further details of generating bispecific
antibodies see, for example, Suresh et al., Methods in Enzymology, 121, 210 (1986). 

According to another approach described in WO 96/27011, the interface between a
pair of antibody molecules can be engineered to maximize the percentage of heterodimers 

that are recovered from recombinant cell culture. The preferred interface comprises at least
a part of the CH3 region of an antibody constant domain. In this method, one or more small 
amino acid side chains from the interface of the first antibody molecule are replaced with 
larger side chains (e.g. tyrosine or tryptophan). Compensatory "cavities" of identical or 
similar size to the large side chain(s) are created on the interface of the second antibody 

molecule by replacing large amino acid side chains with smaller ones (e.g. alanine or
threonine). This provides a mechanism for increasing the yield of the heterodimer over other
unwanted end-products such as homodimers. 

Bispecific antibodies can be prepared as full-length antibodies or antibody fragments 
(e.g. F(ab')2 bispecific antibodies). Techniques for generating bispecific antibodies from 

antibody fragments have been described in the literature. For example, bispecific antibodies
can be prepared using chemical linkage. Brennan et al., Science 229, 81 (1985) describe a 
procedure wherein intact antibodies are proteolytically cleaved to generate F(ab')2 
fragments. These fragments are reduced in the presence of the dithiol complexing agent 
sodium arsenite to stabilize vicinal dithiols and prevent intermolecular disulfide formation. 

The Fab' fragments generated are then converted to thionitrobenzoate (TNB) derivatives.
One of the Fab'-TNB derivatives is then reconverted to the Fab'-thiol by reduction with 
mercaptoethylamine and is mixed with an equimolar amount of the other Fab'-TNB 
derivative to form the bispecific antibody. The bispecific antibodies produced can be used 
as agents for the selective immobilization of enzymes. 

Additionally, Fab' fragments can be directly recovered from E. coli and chemically
coupled to form bispecific antibodies. Shalaby et al., J. Exp. Med. 175, 217-225 (1992)
describe the production of a fully humanized bispecific antibody F(ab')2 molecule. Each
Fab' fragment was separately secreted from E. coli and subjected to directed chemical
coupling in vitro to form the bispecific antibody. The bispecific antibody thus formed was
able to bind to cells overexpressing the ErbB2 receptor and normal human T cells, as well as 
trigger the lytic activity of human cytotoxic lymphocytes against human breast tumor targets. 

Various techniques for making and isolating bispecific antibody fragments directly
from recombinant cell culture have also been described. For example, bispecific antibodies
have been produced using leucine zippers. Kostelny et al., J. Immunol. 148(5), 1547-1553 
(1992). The leucine zipper peptides from the Fos and Jun proteins were linked to the Fab' 
portions of two different antibodies by gene fusion. The antibody homodimers were reduced 
at the hinge region to form monomers and then re-oxidized to form the antibody 

heterodimers. This method can also be utilized for the production of antibody homodimers.
The "diabody" technology described by Holliger et al., Proc. Natl. Acad. Sci. USA 90,
6444-6448 (1993) has provided an alternative mechanism for making bispecific antibody
fragments. The fragments comprise a heavy-chain variable domain (VH) connected to a
light-chain variable domain (VL) by a linker which is too short to allow pairing between the
two domains on the same chain. Accordingly, the VH and VL domains of one fragment are
forced to pair with the complementary VL and VH domains of another fragment, thereby
forming two antigen-binding sites. Another strategy for making bispecific antibody 
fragments by the use of single-chain Fv (sFv) dimers has also been reported. See, Gruber et 
al., J. Immunol. 152, 5368 (1994). 

Antibodies with more than two valencies are contemplated. For example, trispecific
antibodies can be prepared. Tutt et al., J. Immunol. 147, 60 (1991).

Exemplary bispecific antibodies can bind to two different epitopes, at least one of 
which originates in the protein antigen of the invention. Alternatively, an anti-antigenic arm 
of an immunoglobulin molecule can be combined with an arm which binds to a triggering 

molecule on a leukocyte such as a T-cell receptor molecule (e.g. CD2, CD3, CD28, or B7),
or Fc receptors for IgG (FcγR), such as FcγRI (CD64), FcγRII (CD32) and FcγRIII (CD16)
so as to focus cellular defense mechanisms to the cell expressing the particular antigen. 
Bispecific antibodies can also be used to direct cytotoxic agents to cells which express a 
particular antigen. These antibodies possess an antigen-binding arm and an arm which binds 

a cytotoxic agent or a radionuclide chelator, such as EOTUBE, DTPA, DOTA, or TETA.
Another bispecific antibody of interest binds the protein antigen described herein and further 
binds tissue factor (TF). 

4.13.7 HETEROCONJUGATE ANTIBODIES

Heteroconjugate antibodies are also within the scope of the present invention. 
Heteroconjugate antibodies are composed of two covalently joined antibodies. Such 
antibodies have, for example, been proposed to target immune system cells to unwanted cells 
(U.S. Patent No. 4,676,980), and for treatment of HIV infection (WO 91/00360; WO
92/200373; EP 03089). It is contemplated that the antibodies can be prepared in vitro using 
known methods in synthetic protein chemistry, including those involving crosslinking 
agents. For example, immunotoxins can be constructed using a disulfide exchange reaction 
or by forming a thioether bond. Examples of suitable reagents for this purpose include 
iminothiolate and methyl-4-mercaptobutyrimidate and those disclosed, for example, in U.S.
Patent No. 4,676,980. 



4.13.8 EFFECTOR FUNCTION ENGINEERING 

It can be desirable to modify the antibody of the invention with respect to effector 
function, so as to enhance, e.g., the effectiveness of the antibody in treating cancer. For
example, cysteine residue(s) can be introduced into the Fc region, thereby allowing 
interchain disulfide bond formation in this region. The homodimeric antibody thus 
generated can have improved internalization capability and/or increased complement- 
mediated cell killing and antibody-dependent cellular cytotoxicity (ADCC). See Caron et 
al., J. Exp. Med., 176, 1191-1195 (1992) and Shopes, J. Immunol., 148, 2918-2922 (1992).
Homodimeric antibodies with enhanced anti-tumor activity can also be prepared using 
heterobifunctional cross-linkers as described in Wolff et al. Cancer Research, 53, 2560- 
2565 (1993). Alternatively, an antibody can be engineered that has dual Fc regions and can 
thereby have enhanced complement lysis and ADCC capabilities. See Stevenson et al., 
Anti-Cancer Drug Design, 3, 219-230 (1989).



4.13.9 IMMUNOCONJUGATES 

The invention also pertains to immunoconjugates comprising an antibody conjugated 
to a cytotoxic agent such as a chemotherapeutic agent, toxin (e.g., an enzymatically active 
toxin of bacterial, fungal, plant, or animal origin, or fragments thereof), or a radioactive
isotope (i.e., a radioconjugate). 

Chemotherapeutic agents useful in the generation of such immunoconjugates have 
been described above. Enzymatically active toxins and fragments thereof that can be used 
include diphtheria A chain, nonbinding active fragments of diphtheria toxin, exotoxin A
chain (from Pseudomonas aeruginosa), ricin A chain, abrin A chain, modeccin A chain,
alpha-sarcin, Aleurites fordii proteins, dianthin proteins, Phytolacca americana proteins
(PAPI, PAPII, and PAP-S), Momordica charantia inhibitor, curcin, crotin, Saponaria
officinalis inhibitor, gelonin, mitogellin, restrictocin, phenomycin, enomycin, and the
trichothecenes. A variety of radionuclides are available for the production of radioconjugated
antibodies. Examples include 212Bi, 131I, 131In, 90Y, and 186Re.

Conjugates of the antibody and cytotoxic agent are made using a variety of
bifunctional protein-coupling agents such as N-succinimidyl-3-(2-pyridyldithiol) propionate
(SPDP), iminothiolane (IT), bifunctional derivatives of imidoesters (such as dimethyl
adipimidate HCl), active esters (such as disuccinimidyl suberate), aldehydes (such as
glutaraldehyde), bis-azido compounds (such as bis (p-azidobenzoyl) hexanediamine),
bis-diazonium derivatives (such as bis-(p-diazoniumbenzoyl)-ethylenediamine), diisocyanates
(such as tolylene 2,6-diisocyanate), and bis-active fluorine compounds (such as
1,5-difluoro-2,4-dinitrobenzene). For example, a ricin immunotoxin can be prepared as described in
Vitetta et al., Science, 238: 1098 (1987). Carbon-14-labeled
1-isothiocyanatobenzyl-3-methyldiethylene triaminepentaacetic acid (MX-DTPA) is an exemplary
chelating agent for conjugation of radionuclide to the antibody. See WO 94/11026.

In another embodiment, the antibody can be conjugated to a "receptor" (such as
streptavidin) for utilization in tumor pretargeting wherein the antibody-receptor conjugate is
administered to the patient, followed by removal of unbound conjugate from the circulation 
using a clearing agent and then administration of a "ligand" (e.g., avidin) that is in turn 
conjugated to a cytotoxic agent. 

4.14 COMPUTER READABLE SEQUENCES 

In one application of this embodiment, a nucleotide sequence of the present invention 
can be recorded on computer readable media. As used herein, "computer readable media" 
refers to any medium which can be read and accessed directly by a computer. Such media 
include, but are not limited to: magnetic storage media, such as floppy discs, hard disc
storage medium, and magnetic tape; optical storage media such as CD-ROM; electrical 
storage media such as RAM and ROM; and hybrids of these categories such as 
magnetic/optical storage media. A skilled artisan can readily appreciate how any of the 
presently known computer readable mediums can be used to create a manufacture
comprising computer readable medium having recorded thereon a nucleotide sequence of the 
present invention. As used herein, "recorded" refers to a process for storing information on 
computer readable medium. A skilled artisan can readily adopt any of the presently known 
methods for recording information on computer readable medium to generate manufactures
comprising the nucleotide sequence information of the present invention. 

A variety of data storage structures are available to a skilled artisan for creating a 
computer readable medium having recorded thereon a nucleotide sequence of the present 
invention. The choice of the data storage structure will generally be based on the means 

chosen to access the stored information. In addition, a variety of data processor programs
and formats can be used to store the nucleotide sequence information of the present 
invention on computer readable medium. The sequence information can be represented in a 
word processing text file, formatted in commercially-available software such as WordPerfect 
and Microsoft Word, or represented in the form of an ASCII file, stored in a database 

application, such as DB2, Sybase, Oracle, or the like. A skilled artisan can readily adapt any
number of data processor structuring formats (e.g. text file or database) in order to obtain
computer readable medium having recorded thereon the nucleotide sequence information of 
the present invention. 
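
By way of illustration only, the two storage routes named above (an ASCII text file and a database application) can be sketched as follows. The sketch assumes Python with its built-in SQLite module standing in for a database such as DB2, Sybase or Oracle; the function names, the table layout and the placeholder record are hypothetical and are not sequences or formats of the invention.

```python
import sqlite3
import textwrap

def to_fasta(name, sequence, width=60):
    """Format a sequence as a plain ASCII FASTA-style record."""
    return ">" + name + "\n" + "\n".join(textwrap.wrap(sequence, width)) + "\n"

def store_in_database(conn, name, sequence):
    """Record a sequence in a relational table (SQLite used here as a stand-in)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sequences (name TEXT PRIMARY KEY, residues TEXT)"
    )
    conn.execute("INSERT OR REPLACE INTO sequences VALUES (?, ?)", (name, sequence))
    conn.commit()

def fetch_from_database(conn, name):
    """Read a recorded sequence back, or return None if it is absent."""
    row = conn.execute(
        "SELECT residues FROM sequences WHERE name = ?", (name,)
    ).fetchone()
    return row[0] if row else None

# Placeholder record for demonstration; not a sequence of the invention.
example_name = "example_record"
example_seq = "ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAG"

fasta_text = to_fasta(example_name, example_seq, width=20)
conn = sqlite3.connect(":memory:")
store_in_database(conn, example_name, example_seq)
```

Either representation can be read back without loss, which is the property that matters when choosing a format according to the means used to access the stored information.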

By providing any of the nucleotide sequences SEQ ID NO: 1-276, or 553-772 or a 

representative fragment thereof; or a nucleotide sequence at least 95% identical to any of the
nucleotide sequences of SEQ ID NO: 1-276, or 553-772 in computer readable form, a skilled 
artisan can routinely access the sequence information for a variety of purposes. Computer 
software is publicly available which allows a skilled artisan to access sequence information 
provided in a computer readable medium. The examples which follow demonstrate how 

software which implements the BLAST (Altschul et al., J. Mol. Biol. 215:403-410 (1990))
and BLAZE (Brutlag et al., Comp. Chem. 17:203-207 (1993)) search algorithms on a Sybase 
system is used to identify open reading frames (ORFs) within a nucleic acid sequence. Such 
ORFs may be protein-encoding fragments and may be useful in producing commercially 
important proteins such as enzymes used in fermentation reactions and in the production of 

commercially useful metabolites.
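
As a rough illustration of what identifying an ORF within a stored nucleic acid sequence involves, the sketch below scans both strands in all three reading frames for an ATG codon followed in frame by a stop codon. It is a deliberately naive codon scan written for this passage, not the BLAST or BLAZE algorithm, and the function names and the minimum-length parameter are assumptions.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]

def find_orfs(seq, min_codons=3):
    """Return (strand, frame, start, end, orf) for each ATG...stop ORF.

    start and end index the strand being scanned; end is exclusive and the
    reported ORF includes its stop codon.
    """
    seq = seq.upper()
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i  # first start codon opens the ORF
                elif start is not None and codon in STOP_CODONS:
                    if (i + 3 - start) // 3 >= min_codons:
                        orfs.append((strand, frame, start, i + 3, s[start:i + 3]))
                    start = None  # resume looking for the next start codon
    return orfs

# Placeholder input for demonstration; not a sequence of the invention.
print(find_orfs("CCATGAAATTTGGGTAGCC"))
# prints: [('+', 2, 2, 17, 'ATGAAATTTGGGTAG')]
```

A practical system would follow such a scan with a homology search of each candidate ORF against known protein-encoding sequences.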

As used herein, "a computer-based system" refers to the hardware means, software 
means, and data storage means used to analyze the nucleotide sequence information of the 
present invention. The minimum hardware means of the computer-based systems of the 
present invention comprises a central processing unit (CPU), input means, output means, and
data storage means. A skilled artisan can readily appreciate that any one of the currently
available computer-based systems is suitable for use in the present invention. As stated
above, the computer-based systems of the present invention comprise a data storage means 
having stored therein a nucleotide sequence of the present invention and the necessary
hardware means and software means for supporting and implementing a search means. As 
used herein, "data storage means" refers to memory which can store nucleotide sequence 
information of the present invention, or a memory access means which can access 
manufactures having recorded thereon the nucleotide sequence information of the present 
invention.

As used herein, "search means" refers to one or more programs which are 
implemented on the computer-based system to compare a target sequence or target structural 
motif with the sequence information stored within the data storage means. Search means are 
used to identify fragments or regions of a known sequence which match a particular target 

sequence or target motif. A variety of known algorithms are disclosed publicly and a variety
of commercially available software for conducting search means are available and can be used in the
computer-based systems of the present invention. Examples of such software include, but
are not limited to, Smith-Waterman, MacPattern (EMBL), BLASTN and BLASTA
(NCBIA). A skilled artisan can readily recognize that any one of the available

algorithms or implementing software packages for conducting homology searches can be
adapted for use in the present computer-based systems. As used herein, a "target sequence" 
can be any nucleic acid or amino acid sequence of six or more nucleotides or two or more 
amino acids. A skilled artisan can readily recognize that the longer a target sequence is, the 
less likely a target sequence will be present as a random occurrence in the database. The 

most preferred sequence length of a target sequence is from about 10 to 300 amino acids,
more preferably from about 30 to 100 nucleotide residues. However, it is well recognized 
that searches for commercially important fragments, such as sequence fragments involved in 
gene expression and protein processing, may be of shorter length. 
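
The relationship between target length and chance occurrence can be made concrete with a minimal sketch. It assumes exact string matching in place of the Smith-Waterman or BLAST algorithms named above, and a database modeled as a uniformly random sequence; both simplifications, and the function names, are assumptions made for illustration.

```python
def find_matches(target, database_seq):
    """Return 0-based start positions of exact matches (overlaps allowed)."""
    positions = []
    start = database_seq.find(target)
    while start != -1:
        positions.append(start)
        start = database_seq.find(target, start + 1)
    return positions

def expected_chance_matches(target_len, database_len, alphabet_size=4):
    """Expected number of exact matches to a target in a uniformly random sequence."""
    windows = max(database_len - target_len + 1, 0)
    return windows * (1 / alphabet_size) ** target_len

print(find_matches("ATA", "ATATATA"))         # prints: [0, 2, 4]
print(expected_chance_matches(6, 1_000_005))  # prints: 244.140625
```

Under these assumptions a six-nucleotide target is expected to occur by chance roughly 244 times per million positions, whereas a thirty-nucleotide target is expected to occur far less than once, which is why longer target sequences are preferred for database searches.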

As used herein, "a target structural motif," or "target motif," refers to any rationally 

selected sequence or combination of sequences in which the sequence(s) are chosen based on
a three-dimensional configuration which is formed upon the folding of the target motif.
There are a variety of target motifs known in the art. Protein target motifs include, but are
not limited to, enzyme active sites and signal sequences. Nucleic acid target motifs include, 
but are not limited to, promoter sequences, hairpin structures and inducible expression
elements (protein binding sequences). 
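
As one hedged example of a search for a nucleic acid target motif, the sketch below looks for candidate hairpin structures, modeled here simply as a stem followed by a short loop and then the reverse complement of the stem. The perfect-match stem model, the default stem and loop lengths, and the function names are assumptions chosen for illustration; practical hairpin prediction would also consider mismatches and folding energy.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]

def find_hairpins(seq, stem_len=4, min_loop=3, max_loop=8):
    """Return (stem_start, loop_len) for each perfect inverted repeat found."""
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - 2 * stem_len - min_loop + 1):
        stem = seq[i:i + stem_len]
        partner = reverse_complement(stem)  # strand that would pair with the stem
        for loop in range(min_loop, max_loop + 1):
            j = i + stem_len + loop
            if seq[j:j + stem_len] == partner:
                hits.append((i, loop))
    return hits

# Placeholder input: stem GGAC, loop AAAA, partner GTCC.
print(find_hairpins("TTGGACAAAAGTCCTT"))  # prints: [(2, 4)]
```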

4.15 TRIPLE HELIX FORMATION 

In addition, the fragments of the present invention, as broadly described, can be used
to control gene expression through triple helix formation or antisense DNA or RNA, both of
which methods are based on the binding of a polynucleotide sequence to DNA or RNA. 
Polynucleotides suitable for use in these methods are preferably 20 to 40 bases in length and 
are designed to be complementary to a region of the gene involved in transcription (triple 

helix - see Lee et al., Nucl. Acids Res. 6, 3073 (1979); Cooney et al., Science 241, 456
(1988); and Dervan et al., Science 251, 1360 (1991)) or to the mRNA itself (antisense -
Okano, J. Neurochem. 56:560 (1991); Oligodeoxynucleotides as Antisense Inhibitors of
Gene Expression, CRC Press, Boca Raton, FL (1988)). Triple helix-formation optimally 
results in a shut-off of RNA transcription from DNA, while antisense RNA hybridization 

blocks translation of an mRNA molecule into polypeptide. Both techniques have been
demonstrated to be effective in model systems. Information contained in the sequences of
the present invention is necessary for the design of an antisense or triple helix 
oligonucleotide. 
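
To show how sequence information feeds the design of an antisense oligonucleotide, the following minimal sketch derives a DNA oligonucleotide complementary to a chosen window of an mRNA. It enforces only the preferred 20 to 40 base length stated above; the function name, the window-selection interface and the placeholder sequence are assumptions, and considerations such as target accessibility or melting temperature are outside its scope.

```python
# Map each mRNA base to its complementary DNA base.
DNA_COMPLEMENT = str.maketrans("ACGU", "TGCA")

def antisense_oligo(mrna, start, length=20):
    """Return a DNA oligo, written 5'->3', complementary to mrna[start:start+length]."""
    if not 20 <= length <= 40:
        raise ValueError("length outside the preferred 20 to 40 base range")
    # Accept either RNA or DNA notation for the input sequence.
    target = mrna.upper().replace("T", "U")[start:start + length]
    if len(target) < length:
        raise ValueError("target window runs past the end of the sequence")
    # Complement each base, then reverse so the result reads 5'->3'.
    return target.translate(DNA_COMPLEMENT)[::-1]

# Placeholder mRNA for demonstration; not a sequence of the invention.
example_mrna = "AUGGCCAUUGUAAUGGGCCGCUGAAAG"
print(antisense_oligo(example_mrna, 0, 20))  # prints: CGGCCCATTACAATGGCCAT
```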

4.16 DIAGNOSTIC ASSAYS AND KITS

The present invention further provides methods to identify the presence or expression 
of one of the ORFs of the present invention, or homolog thereof, in a test sample, using a 
nucleic acid probe or antibodies of the present invention, optionally conjugated or otherwise 
associated with a suitable label. 

In general, methods for detecting a polynucleotide of the invention can comprise
contacting a sample with a compound that binds to and forms a complex with the
polynucleotide for a period sufficient to form the complex, and detecting the complex, so 
that if a complex is detected, a polynucleotide of the invention is detected in the sample. 
Such methods can also comprise contacting a sample under stringent hybridization 

conditions with nucleic acid primers that anneal to a polynucleotide of the invention under
such conditions, and amplifying annealed polynucleotides, so that if a polynucleotide is 
amplified, a polynucleotide of the invention is detected in the sample. 

In general, methods for detecting a polypeptide of the invention can comprise
contacting a sample with a compound that binds to and forms a complex with the 
polypeptide for a period sufficient to form the complex, and detecting the complex, so that if 
a complex is detected, a polypeptide of the invention is detected in the sample. 

In detail, such methods comprise incubating a test sample with one or more of the
antibodies or one or more of the nucleic acid probes of the present invention and assaying
for binding of the nucleic acid probes or antibodies to components within the test sample. 

Conditions for incubating a nucleic acid probe or antibody with a test sample vary. 
Incubation conditions depend on the format employed in the assay, the detection methods 

employed, and the type and nature of the nucleic acid probe or antibody used in the assay.
One skilled in the art will recognize that any one of the commonly available hybridization, 
amplification or immunological assay formats can readily be adapted to employ the nucleic 
acid probes or antibodies of the present invention. Examples of such assays can be found in 
Chard, T., An Introduction to Radioimmunoassay and Related Techniques, Elsevier Science 

Publishers, Amsterdam, The Netherlands (1986); Bullock, G.R. et al., Techniques in
Immunocytochemistry, Academic Press, Orlando, FL Vol. 1 (1982), Vol. 2 (1983), Vol. 3
(1985); Tijssen, P., Practice and Theory of immunoassays: Laboratory Techniques in 
Biochemistry and Molecular Biology, Elsevier Science Publishers, Amsterdam, The 
Netherlands (1985). The test samples of the present invention include cells, protein or 

membrane extracts of cells, or biological fluids such as sputum, blood, serum, plasma, or
urine. The test sample used in the above-described method will vary based on the assay 
format, nature of the detection method and the tissues, cells or extracts used as the sample to 
be assayed. Methods for preparing protein extracts or membrane extracts of cells are well 
known in the art and can readily be adapted in order to obtain a sample which is
compatible with the system utilized.

In another embodiment of the present invention, kits are provided which contain the 
necessary reagents to carry out the assays of the present invention. Specifically, the 
invention provides a compartment kit to receive, in close confinement, one or more 
containers which comprises: (a) a first container comprising one of the probes or antibodies 

of the present invention; and (b) one or more other containers comprising one or more of the
following: wash reagents, reagents capable of detecting presence of a bound probe or 
antibody. 

In detail, a compartment kit includes any kit in which reagents are contained in
separate containers. Such containers include small glass containers, plastic containers or
strips of plastic or paper. Such containers allow one to efficiently transfer reagents from
one compartment to another compartment such that the samples and reagents are not
cross-contaminated, and the agents or solutions of each container can be added in a
quantitative fashion from one compartment to another. Such containers will include a 
container which will accept the test sample, a container which contains the antibodies used 
in the assay, containers which contain wash reagents (such as phosphate buffered saline, 
Tris-buffers, etc.), and containers which contain the reagents used to detect the bound 

antibody or probe. Types of detection reagents include labeled nucleic acid probes, labeled
secondary antibodies, or in the alternative, if the primary antibody is labeled, the enzymatic, 
or antibody binding reagents which are capable of reacting with the labeled antibody. One 
skilled in the art will readily recognize that the disclosed probes and antibodies of the present 
invention can be readily incorporated into one of the established kit formats which are well 

known in the art.

4.17 MEDICAL IMAGING 

The novel polypeptides and binding partners of the invention are useful in medical 
imaging of sites expressing the molecules of the invention (e.g., where the polypeptide of the 
invention is involved in the immune response, for imaging sites of inflammation or
infection). See, e.g., Kunkel et al., U.S. Pat. No. 5,413,778. Such methods involve
chemical attachment of a labeling or imaging agent, administration of the labeled
polypeptide to a subject in a pharmaceutically acceptable carrier, and imaging the labeled
polypeptide in vivo at the target site. 

4.18 SCREENING ASSAYS 

Using the isolated proteins and polynucleotides of the invention, the present 
invention further provides methods of obtaining and identifying agents which bind to a 
polypeptide encoded by an ORF corresponding to any of the nucleotide sequences set forth 
in SEQ ID NO: 1-276, or 553-772, or bind to a specific domain of the polypeptide encoded
by the nucleic acid. In detail, said method comprises the steps of: 

(a) contacting an agent with an isolated protein encoded by an ORF of the 
present invention, or nucleic acid of the invention; and 
(b) determining whether the agent binds to said protein or said nucleic acid.

In general, therefore, such methods for identifying compounds that bind to a 
polynucleotide of the invention can comprise contacting a compound with a polynucleotide 
of the invention for a time sufficient to form a polynucleotide/compound complex, and 
detecting the complex, so that if a polynucleotide/compound complex is detected, a
compound that binds to a polynucleotide of the invention is identified. 

Likewise, in general, therefore, such methods for identifying compounds that bind to 
a polypeptide of the invention can comprise contacting a compound with a polypeptide of 
the invention for a time sufficient to form a polypeptide/compound complex, and detecting 
the complex, so that if a polypeptide/compound complex is detected, a compound that binds
to a polypeptide of the invention is identified.

Methods for identifying compounds that bind to a polypeptide of the invention can 
also comprise contacting a compound with a polypeptide of the invention in a cell for a time 
sufficient to form a polypeptide/compound complex, wherein the complex drives expression 
of a reporter gene sequence in the cell, and detecting the complex by detecting reporter gene
sequence expression, so that if a polypeptide/compound complex is detected, a compound 
that binds a polypeptide of the invention is identified. 

Compounds identified via such methods can include compounds which modulate the 
activity of a polypeptide of the invention (that is, increase or decrease its activity, relative to 
activity observed in the absence of the compound). Alternatively, compounds identified via
such methods can include compounds which modulate the expression of a polynucleotide of 
the invention (that is, increase or decrease expression relative to expression levels observed 
in the absence of the compound). Compounds, such as compounds identified via the 
methods of the invention, can be tested using standard assays well known to those of skill in 
the art for their ability to modulate activity/expression.

The agents screened in the above assay can be, but are not limited to, peptides, 
carbohydrates, vitamin derivatives, or other pharmaceutical agents. The agents can be 
selected and screened at random or rationally selected or designed using protein modeling 
techniques. 

For random screening, agents such as peptides, carbohydrates, pharmaceutical agents
and the like are selected at random and are assayed for their ability to bind to the protein
encoded by the ORF of the present invention. Alternatively, agents may be rationally 
selected or designed. As used herein, an agent is said to be "rationally selected or designed" 
when the agent is chosen based on the configuration of the particular protein. For example,
one skilled in the art can readily adapt currently available procedures to generate peptides, 
pharmaceutical agents and the like, capable of binding to a specific peptide sequence, in 
order to generate rationally designed antipeptide peptides, for example see Hurby et al.,
"Application of Synthetic Peptides: Antisense Peptides," In Synthetic Peptides, A User's
Guide, W.H. Freeman, NY (1992), pp. 289-307, and Kaspczak et al., Biochemistry
28:9230-8 (1989), or pharmaceutical agents, or the like. 

In addition to the foregoing, one class of agents of the present invention, as broadly 
described, can be used to control gene expression through binding to one of the ORFs or 

EMFs of the present invention. As described above, such agents can be randomly screened
or rationally designed/selected. Targeting the ORF or EMF allows a skilled artisan to design 
sequence specific or element specific agents, modulating the expression of either a single 
ORF or multiple ORFs which rely on the same EMF for expression control. One class of 
DNA binding agents are agents which contain base residues which hybridize or form a triple 

helix formation by binding to DNA or RNA. Such agents can be based on the classic
phosphodiester, ribonucleic acid backbone, or can be a variety of sulfhydryl or polymeric
derivatives which have base attachment capacity. 

Agents suitable for use in these methods preferably contain 20 to 40 bases and are 
designed to be complementary to a region of the gene involved in transcription (triple helix - 

see Lee et al., Nucl. Acids Res. 6, 3073 (1979); Cooney et al., Science 241, 456 (1988); and
Dervan et al., Science 251, 1360 (1991)) or to the mRNA itself (antisense-Okano, J. 
Neurochem. 56, 560 (1991); Oligodeoxynucleotides as Antisense Inhibitors of Gene 
Expression, CRC Press, Boca Raton, FL (1988)). Triple helix-formation optimally results in 
a shut-off of RNA transcription from DNA, while antisense RNA hybridization blocks 

25 translation of an mRNA molecule into polypeptide. Both techniques have been 

demonstrated to be effective in model systems. Information contained in the sequences of 
the present invention is necessary for the design of an antisense or triple helix 
oligonucleotide and other DNA binding agents. 
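
By way of illustration only, the following minimal sketch enumerates candidate antisense 
oligonucleotides of 20 to 40 bases that are complementary to a target sequence. The function 
names and the example target are hypothetical, and the target is assumed to be given as the 
sense strand in DNA letters; no selection for binding efficiency is attempted. 

```python
# Illustrative sketch: candidate antisense oligonucleotides as reverse
# complements of windows of a target (sense-strand) sequence.
# All names and the example sequence are hypothetical.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def antisense_candidates(target: str, length: int = 20) -> list:
    """Return every antisense oligonucleotide of the given length.

    The length is restricted to the 20 to 40 base range stated in the text.
    """
    if not 20 <= length <= 40:
        raise ValueError("length must be between 20 and 40 bases")
    target = target.upper()
    return [
        reverse_complement(target[i:i + length])
        for i in range(len(target) - length + 1)
    ]


example_target = "ATGGCCATTGTAATGGGCCGC"  # made-up 21-base sequence
candidates = antisense_candidates(example_target, length=20)
```

Each candidate, when reverse complemented again, reproduces the window of the target to 
which it would hybridize. 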

Agents which bind to a protein encoded by one of the ORFs of the present invention 
can be used as a diagnostic agent. Agents which bind to a protein encoded by one of the 
ORFs of the present invention can be formulated using known techniques to generate a 
pharmaceutical composition. 

4.19 USE OF NUCLEIC ACIDS AS PROBES 

Another aspect of the subject invention is to provide for polypeptide-specific nucleic 
acid hybridization probes capable of hybridizing with naturally occurring nucleotide 
sequences. The hybridization probes of the subject invention may be derived from any of 
the nucleotide sequences SEQ ID NO: 1-276, or 553-772. Because the corresponding gene 
is only expressed in a limited number of tissues, a hybridization probe derived from any of 
the nucleotide sequences SEQ ID NO: 1-276, or 553-772 can be used as an indicator of the 
presence of RNA of a cell type of such a tissue in a sample. 

Any suitable hybridization technique can be employed, such as, for example, in situ 
hybridization. PCR as described in US Patents Nos. 4,683,195 and 4,965,188 provides 
additional uses for oligonucleotides based upon the nucleotide sequences. Such probes used 
in PCR may be of recombinant origin, may be chemically synthesized, or a mixture of both. 
The probe will comprise a discrete nucleotide sequence for the detection of identical 
sequences or a degenerate pool of possible sequences for identification of closely related 
genomic sequences. 
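
As an illustration of what a degenerate pool of possible sequences contains, the following 
sketch expands a probe written with the standard IUPAC nucleotide ambiguity codes into every 
discrete sequence it represents. The function name and the example probes are hypothetical 
and are not taken from the Sequence Listing. 

```python
# Illustrative sketch: expand a degenerate probe into its pool of
# discrete sequences using IUPAC nucleotide ambiguity codes.
from itertools import product

IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_degenerate(probe: str) -> list:
    """Return all discrete sequences encoded by a degenerate probe."""
    choices = [IUPAC[base] for base in probe.upper()]
    return ["".join(combo) for combo in product(*choices)]


pool = expand_degenerate("ARY")  # hypothetical three-base probe
```

The size of the pool is the product of the number of alternatives at each position, so the 
pool grows quickly with the number of degenerate positions. 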

Other means for producing specific hybridization probes for nucleic acids include the 
cloning of nucleic acid sequences into vectors for the production of mRNA probes. Such 
vectors are known in the art and are commercially available and may be used to synthesize 
RNA probes in vitro by means of the addition of the appropriate RNA polymerase, such as T7 or 
SP6 RNA polymerase, and the appropriate radioactively labeled nucleotides. The nucleotide 
sequences may be used to construct hybridization probes for mapping their respective 
genomic sequences. The nucleotide sequence provided herein may be mapped to a 
chromosome or specific regions of a chromosome using well-known genetic and/or 
chromosomal mapping techniques. These techniques include in situ hybridization, linkage 
analysis against known chromosomal markers, hybridization screening with libraries or 
flow-sorted chromosomal preparations specific to known chromosomes, and the like. The 
technique of fluorescent in situ hybridization of chromosome spreads has been described, 
among other places, in Verma et al. (1988) Human Chromosomes: A Manual of Basic 
Techniques, Pergamon Press, New York NY. 

Fluorescent in situ hybridization of chromosomal preparations and other physical 
chromosome mapping techniques may be correlated with additional genetic map data. 
Examples of genetic map data can be found in the 1994 Genome Issue of Science 
(265:1981f). Correlation between the location of a nucleic acid on a physical chromosomal 
map and a specific disease (or predisposition to a specific disease) may help delimit the 
region of DNA associated with that genetic disease. The nucleotide sequences of the subject 
invention may be used to detect differences in gene sequences between normal, carrier or 
affected individuals. 

4.20 PREPARATION OF SUPPORT BOUND OLIGONUCLEOTIDES 

Oligonucleotides, i.e., small nucleic acid segments, may be readily prepared by, for 
example, directly synthesizing the oligonucleotide by chemical means, as is commonly 
practiced using an automated oligonucleotide synthesizer. 

Support bound oligonucleotides may be prepared by any of the methods known to those 
of skill in the art using any suitable support such as glass, polystyrene or Teflon. One strategy 
is to precisely spot oligonucleotides synthesized by standard synthesizers. Immobilization can 
be achieved using passive adsorption (Inouye & Hondo, (1990) J. Clin. Microbiol. 28(6), 1469- 
72); using UV light (Nagata et al., 1985; Dahlen et al., 1987; Morrissey & Collins, (1989) Mol. 
Cell Probes 3(2) 189-207) or by covalent binding of base modified DNA (Keller et al., 1988; 
1989); all references being specifically incorporated herein. 

Another strategy that may be employed is the use of the strong biotin-streptavidin 
interaction as a linker. For example, Broude et al. (1994) Proc. Natl. Acad. Sci. USA 91(8), 
3072-6, describe the use of biotinylated probes, although these are duplex probes, that are 
immobilized on streptavidin-coated magnetic beads. Streptavidin-coated beads may be 
purchased from Dynal, Oslo. Of course, this same linking chemistry is applicable to coating 
any surface with streptavidin. Biotinylated probes may be purchased from various sources, 
such as, e.g., Operon Technologies (Alameda, CA). 

Nunc Laboratories (Naperville, IL) also sells suitable material that could be used. 
Nunc Laboratories have developed a method, termed CovaLink NH, by which DNA can be 
covalently bound to the microwell surface. CovaLink NH is a polystyrene surface grafted with 
secondary amino groups (>NH) that serve as bridgeheads for further covalent coupling. 
CovaLink Modules may be purchased from Nunc Laboratories. DNA molecules may be bound 
to CovaLink exclusively at the 5'-end by a phosphoramidate bond, allowing immobilization of 
more than 1 pmol of DNA (Rasmussen et al. (1991) Anal. Biochem. 198(1) 138-42). 

The use of CovaLink NH strips for covalent binding of DNA molecules at the 5'-end 
has been described (Rasmussen et al., (1991)). In this technology, a phosphoramidate bond is 
employed (Chu et al., (1983) Nucleic Acids Res. 11(8) 6513-29). This is beneficial as 
immobilization using only a single covalent bond is preferred. The phosphoramidate bond joins 
the DNA to the CovaLink NH secondary amino groups that are positioned at the end of spacer 
arms covalently grafted onto the polystyrene surface through a 2 nm long spacer arm. To link 
an oligonucleotide to CovaLink NH via a phosphoramidate bond, the oligonucleotide terminus 
must have a 5'-end phosphate group. It is, perhaps, even possible for biotin to be covalently 
bound to CovaLink and then streptavidin used to bind the probes. 

More specifically, the linkage method includes dissolving DNA in water (7.5 ng/µl) and 
denaturing for 10 min. at 95°C and cooling on ice for 10 min. Ice-cold 0.1 M 1- 
methylimidazole, pH 7.0 (1-MeIm7), is then added to a final concentration of 10 mM 1-MeIm7. 
A ss DNA solution is then dispensed into CovaLink NH strips (75 µl/well) standing on ice. 

Carbodiimide 0.2 M 1-ethyl-3-(3-dimethylaminopropyl)-carbodiimide (EDC), 
dissolved in 10 mM 1-MeIm7, is made fresh and 25 µl added per well. The strips are incubated 
for 5 hours at 50°C. After incubation the strips are washed using, e.g., Nunc-Immuno Wash; 
first the wells are washed 3 times, then they are soaked with washing solution for 5 min., and 
finally they are washed 3 times (wherein the washing solution is 0.4 N NaOH, 0.25% SDS 
heated to 50°C). 
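
The dilution arithmetic implied by the foregoing protocol can be checked with the short 
sketch below. It assumes, for illustration only, that volumes are additive on mixing; the 
function name is hypothetical. Because only volume ratios enter the calculation, the result 
does not depend on the volume unit. 

```python
# Illustrative sketch: concentration of a reagent after it is mixed
# into a larger volume, assuming additive volumes.


def final_concentration(stock_conc: float, added_volume: float,
                        total_volume: float) -> float:
    """Concentration after diluting added_volume of stock to total_volume."""
    if total_volume <= 0 or added_volume < 0 or added_volume > total_volume:
        raise ValueError("volumes must satisfy 0 <= added <= total, total > 0")
    return stock_conc * added_volume / total_volume


# 25 volumes of 0.2 M EDC added to 75 volumes of DNA solution per well.
edc_in_well = final_concentration(0.2, 25, 75 + 25)

# Fraction of the mixture that must be 0.1 M 1-methylimidazole stock
# to reach a final concentration of 10 mM (0.01 M).
stock_fraction = 0.01 / 0.1
```

Under these assumptions the EDC is present at 0.05 M in the well, and the 1-methylimidazole 
stock makes up one tenth of the DNA solution. 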

It is contemplated that a further suitable method for use with the present invention is 
that described in PCT Patent Application WO 90/03382 (Southern & Maskos), incorporated 
herein by reference. This method of preparing an oligonucleotide bound to a support involves 
attaching a nucleoside 3'-reagent through the phosphate group by a covalent phosphodiester link 
to aliphatic hydroxyl groups carried by the support. The oligonucleotide is then synthesized on 
the supported nucleoside and protecting groups removed from the synthetic oligonucleotide 
chain under standard conditions that do not cleave the oligonucleotide from the support. 
Suitable reagents include nucleoside phosphoramidite and nucleoside hydrogen phosphorate. 

An on-chip strategy for the preparation of DNA probe arrays may be employed. For 
example, addressable laser-activated photodeprotection may be 
employed in the chemical synthesis of oligonucleotides directly on a glass surface, as described 
by Fodor et al. (1991) Science 251(4995), 767-73, incorporated herein by reference. Probes 
may also be immobilized on nylon supports as described by Van Ness et al. (1991) Nucleic 
Acids Res., 19(12) 3345-50; or linked to Teflon using the method of Duncan & Cavalier (1988) 
Anal. Biochem. 169(1), 104-8; all references being specifically incorporated herein. 

Linking an oligonucleotide to a nylon support, as described by Van Ness et al. (1991), 
requires activation of the nylon surface via alkylation and selective activation of the 5'-amine of 
oligonucleotides with cyanuric chloride. 

One particular way to prepare support bound oligonucleotides is to utilize the 
light-generated synthesis described by Pease et al. (1994) Proc. Natl. Acad. Sci. USA 91(11), 
5022-6, incorporated herein by reference. These authors used current photolithographic 
techniques to generate arrays of immobilized oligonucleotide probes (DNA chips). These 
methods, in which light is used to direct the synthesis of oligonucleotide probes in high-density, 
miniaturized arrays, utilize photolabile 5'-protected N-acyl-deoxynucleoside phosphoramidites, 
surface linker chemistry and versatile combinatorial synthesis strategies. A matrix of 256 
spatially defined oligonucleotide probes may be generated in this manner. 

4.21 PREPARATION OF NUCLEIC ACID FRAGMENTS 

The nucleic acids may be obtained from any appropriate source, such as cDNAs, 
genomic DNA, chromosomal DNA, microdissected chromosome bands, cosmid or YAC 
inserts, and RNA, including mRNA without any amplification steps. For example, Sambrook 
et al. (1989) describes three protocols for the isolation of high molecular weight DNA from 
mammalian cells (pp. 9.14-9.23). 

DNA fragments may be prepared as clones in M13, plasmid or lambda vectors and/or 
prepared directly from genomic DNA or cDNA by PCR or other amplification methods. 
Samples may be prepared or dispensed in multiwell plates. About 100-1000 ng of DNA 
samples may be prepared in 2-500 ml of final volume. 

The nucleic acids would then be fragmented by any of the methods known to those of 
skill in the art including, for example, using restriction enzymes as described at 9.24-9.28 of 
Sambrook et al. (1989), shearing by ultrasound, and NaOH treatment. 

Low pressure shearing is also appropriate, as described by Schriefer et al. (1990) 
Nucleic Acids Res. 18(24), 7455-6, incorporated herein by reference. In this method, DNA 
samples are passed through a small French pressure cell at a variety of low to intermediate 
pressures. A lever device allows controlled application of low to intermediate pressures to the 
cell. The results of these studies indicate that low-pressure shearing is a useful alternative to 
sonic and enzymatic DNA fragmentation methods. 

One particularly suitable way for fragmenting DNA is contemplated to be that using the 
two base recognition endonuclease, CviJI, described by Fitzgerald et al. (1992) Nucleic Acids 
Res. 20(14) 3753-62. These authors described an approach for the rapid fragmentation and 
fractionation of DNA into particular sizes that they contemplated to be suitable for shotgun 
cloning and sequencing. 

The restriction endonuclease CviJI normally cleaves the recognition sequence PuGCPy 
between the G and C to leave blunt ends. Atypical reaction conditions, which alter the 
specificity of this enzyme (CviJI**), yield a quasi-random distribution of DNA fragments from 
the small molecule pUC19 (2688 base pairs). Fitzgerald et al. (1992) quantitatively evaluated 
the randomness of this fragmentation strategy, using a CviJI** digest of pUC19 that was size 
fractionated by a rapid gel filtration method and directly ligated, without end repair, to a lac Z 
minus M13 cloning vector. Sequence analysis of 76 clones showed that CviJI** restricts 
PyGCPy and PuGCPu, in addition to PuGCPy sites, and that new sequence data is accumulated 
at a rate consistent with random fragmentation. 
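
The difference between the normal and relaxed specificities described above can be 
illustrated with the following sketch, which locates PuGCPy sites (normal conditions) or 
PuGCPy, PyGCPy and PuGCPu sites (relaxed conditions) on one strand and cuts between the G 
and the C. It is a simplified in silico model only: the function names and the test sequence 
are hypothetical, and no attempt is made to model partial digestion. 

```python
# Illustrative sketch: in silico digestion at CviJI-type sites.
# Pu = A or G, Py = C or T; the cut falls between the G and the C.
import re

# Lookahead patterns so that overlapping sites are all reported.
NORMAL_SITES = re.compile(r"(?=[AG]GC[CT])")
RELAXED_SITES = re.compile(r"(?=[AG]GC[CT]|[CT]GC[CT]|[AG]GC[AG])")


def cut_positions(seq: str, relaxed: bool = False) -> list:
    """Return 0-based indices at which the strand is cut."""
    pattern = RELAXED_SITES if relaxed else NORMAL_SITES
    # A site starts at m.start(); the G is at +1, so the cut is at +2.
    return [m.start() + 2 for m in pattern.finditer(seq.upper())]


def digest(seq: str, relaxed: bool = False) -> list:
    """Return the fragments produced by cutting at every site."""
    bounds = [0] + cut_positions(seq, relaxed) + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:])]


example = "TTAGCTAACGCCAA"  # made-up sequence with one site of each kind
normal_fragments = digest(example)
relaxed_fragments = digest(example, relaxed=True)
```

In this example the relaxed specificity cuts at one additional PyGCPy site, giving three 
fragments instead of two. 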

As reported in the literature, advantages of this approach compared to sonication and 
agarose gel fractionation include: smaller amounts of DNA are required (0.2-0.5 µg instead of 
2-5 µg); and fewer steps are involved (no preligation, end repair, chemical extraction, or 
agarose gel electrophoresis and elution are needed). 

Irrespective of the manner in which the nucleic acid fragments are obtained or prepared, 
it is important to denature the DNA to give single stranded pieces available for hybridization. 
This is achieved by incubating the DNA solution for 2-5 minutes at 80-90°C. The solution is 
then cooled quickly to 2°C to prevent renaturation of the DNA fragments before they are 
contacted with the chip. Phosphate groups must also be removed from genomic DNA by 
methods known in the art. 

4.22 PREPARATION OF DNA ARRAYS 

Arrays may be prepared by spotting DNA samples on a support such as a nylon 
membrane. Spotting may be performed by using arrays of metal pins (the positions of which 
correspond to an array of wells in a microtiter plate) for repeated transfer of about 20 nl of a 
DNA solution to a nylon membrane. By offset printing, a density of dots higher than the density 
of the wells is achieved. One to 25 dots may be accommodated in 1 mm², depending on the 
type of label used. By avoiding spotting in some preselected number of rows and columns, 
separate subsets (subarrays) may be formed. Samples in one subarray may be the same genomic 
segment of DNA (or the same gene) from different individuals, or may be different, overlapped 
genomic clones. Each of the subarrays may represent replica spotting of the same samples. In 
one example, a selected gene segment may be amplified from 64 patients. For each patient, the 
amplified gene segment may be in one 96-well plate (all 96 wells containing the same sample). 
A plate for each of the 64 patients is prepared. By using a 96-pin device, all samples may be 
spotted on one 8 x 12 cm membrane. Subarrays may contain 64 samples, one from each patient. 
Where the 96 subarrays are identical, the dot span may be 1 mm² and there may be a 1 mm 
space between subarrays. 
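
The geometry of the 64-patient example can be sketched as follows. The sketch assumes, 
purely for illustration, a 9 mm pin-to-pin pitch (the usual 96-well spacing, which is not 
stated above) and an 8 x 8 arrangement of the 64 dots within each subarray; the function and 
constant names are hypothetical. 

```python
# Illustrative sketch: offset-printing coordinates for 64 plates spotted
# with a 96-pin device (8 rows x 12 columns of pins).
PIN_ROWS, PIN_COLS = 8, 12
PIN_PITCH_MM = 9.0      # assumption: standard 96-well pitch
DOT_PITCH_MM = 1.0      # each dot spans 1 mm
SUBARRAY_SIDE = 8       # assumption: 64 dots arranged 8 x 8
PATIENTS = SUBARRAY_SIDE * SUBARRAY_SIDE


def dot_position(pin_row: int, pin_col: int, patient: int) -> tuple:
    """Return (x_mm, y_mm) of the dot left by one pin for one patient plate."""
    if not (0 <= pin_row < PIN_ROWS and 0 <= pin_col < PIN_COLS):
        raise ValueError("pin index out of range")
    if not 0 <= patient < PATIENTS:
        raise ValueError("patient index out of range")
    sub_row, sub_col = divmod(patient, SUBARRAY_SIDE)
    x = pin_col * PIN_PITCH_MM + sub_col * DOT_PITCH_MM
    y = pin_row * PIN_PITCH_MM + sub_row * DOT_PITCH_MM
    return (x, y)


all_dots = {
    dot_position(r, c, p)
    for r in range(PIN_ROWS)
    for c in range(PIN_COLS)
    for p in range(PATIENTS)
}
```

Under these assumptions each 8 mm wide subarray is followed by a 1 mm gap before the next 
pin position, and all 6144 dots fall within an 8 x 12 cm membrane. 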

Another approach is to use membranes or plates (available from NUNC, Naperville, 
Illinois) which may be partitioned by physical spacers e.g. a plastic grid molded over the 
membrane, the grid being similar to the sort of membrane applied to the bottom of multiwell 
plates, or hydrophobic strips. A fixed physical spacer is not preferred for imaging by exposure 
to flat phosphor-storage screens or x-ray films. 

The present invention is illustrated in the following examples. Upon consideration of 
the present disclosure, one of skill in the art will appreciate that many other embodiments and 
variations may be made in the scope of the present invention. Accordingly, it is intended that 
the broader aspects of the present invention not be limited to the disclosure of the following 
examples. The present invention is not to be limited in scope by the exemplified embodiments 
which are intended as illustrations of single aspects of the invention, and compositions and 
methods which are functionally equivalent are within the scope of the invention. Indeed, 
numerous modifications and variations in the practice of the invention are expected to occur to 
those skilled in the art upon consideration of the present preferred embodiments. Consequently, 
the only limitations which should be placed upon the scope of the invention are those which 
appear in the appended claims. 

All references cited within the body of the instant specification are hereby incorporated 
by reference in their entirety. 

5.0 EXAMPLES 

5.1 EXAMPLE 1 

Novel Nucleic Acid Sequences Obtained From Various Libraries 
A plurality of novel nucleic acids were obtained from cDNA libraries prepared from 
various human tissues and in some cases isolated from a genomic library derived from human 
chromosome using standard PCR, SBH sequence signature analysis and Sanger sequencing 
techniques. The inserts of the library were amplified with PCR using primers specific for the 
vector sequences which flank the inserts. Clones from cDNA libraries were spotted on nylon 
membrane filters and screened with oligonucleotide probes (e.g., 7-mers) to obtain signature 
sequences. The clones were clustered into groups of similar or identical sequences. 
Representative clones were selected for sequencing. 

In some cases, the 5' sequence of the amplified inserts was then deduced using a typical 
Sanger sequencing protocol. PCR products were purified and subjected to fluorescent dye 
terminator cycle sequencing. Single pass gel sequencing was done using a 377 Applied 
Biosystems (ABI) sequencer to obtain the novel nucleic acid sequences. 

5.2 EXAMPLE 2 

Assemblage of Novel Nucleic Acids 

The contigs or nucleic acids of the present invention, designated as SEQ ID NO: 553- 
772, were assembled using an EST sequence as a seed. Then a recursive algorithm was used to 
extend the seed EST into an extended assemblage, by pulling additional sequences from 
different databases (i.e., Hyseq's database containing EST sequences, dbEST, gb pri, and 
UniGene, and exons from public domain genomic sequences predicted by GenScan) that 
belong to this assemblage. The algorithm terminated when there were no additional sequences 
from the above databases that would extend the assemblage. Further, inclusion of component 
sequences into the assemblage was based on a BLASTN hit to the extending assemblage with 
BLAST score greater than 300 and percent identity greater than 95%. 
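
The extension procedure described above may be sketched as follows. This is not the actual 
program used; it is a minimal illustration in which the database search is represented by a 
hypothetical `find_hits` callable standing in for a BLASTN search against the current 
assemblage, and in which the recursion is written as a loop. The sequence identifiers and 
scores in the toy data are invented. 

```python
# Illustrative sketch: extend a seed EST into an assemblage until no
# further qualifying sequences are found.
from typing import Callable, Iterable, NamedTuple


class Hit(NamedTuple):
    seq_id: str
    score: float
    percent_identity: float


def extend_assemblage(seed_id: str,
                      find_hits: Callable[[set], Iterable[Hit]],
                      min_score: float = 300.0,
                      min_identity: float = 95.0) -> set:
    """Grow the assemblage from the seed; stop when nothing new qualifies."""
    members = {seed_id}
    while True:
        new_ids = {
            hit.seq_id
            for hit in find_hits(members)
            if hit.score > min_score and hit.percent_identity > min_identity
        } - members
        if not new_ids:          # termination condition stated in the text
            return members
        members |= new_ids


# Toy stand-in for the database search (hypothetical data).
TOY_LINKS = {
    "seed": [Hit("est1", 450, 98.0), Hit("est2", 310, 94.0)],
    "est1": [Hit("est3", 301, 95.5)],
    "est3": [Hit("est4", 299, 99.0)],
}


def toy_find_hits(members: set) -> list:
    return [hit for m in members for hit in TOY_LINKS.get(m, [])]


assemblage = extend_assemblage("seed", toy_find_hits)
```

In the toy data, est2 is excluded for insufficient identity and est4 for insufficient score, 
so the assemblage stops growing after est3 is added. 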

The novel predicted polypeptides (including proteins) encoded by the novel 
polynucleotides (SEQ ID NO: 553-772) of the present invention, and their corresponding 
translation start and stop nucleotide locations to each of SEQ ID NO: 553-772 were obtained 
using one of two methods. Polypeptides were obtained by using a software program called 
FASTY (available from http://fasta.bioch.virginia.edu) which selects a polypeptide based on a 
comparison of the translated novel polynucleotide to known polynucleotides (W.R. Pearson, 
Methods in Enzymology, 183:63-98 (1990), herein incorporated by reference). Alternatively, 
polypeptides were obtained by using a software program called GenScan for human/vertebrate 
sequences (available from Stanford University, Office of Technology Licensing) that predicts 
the polypeptide based on a probabilistic model of gene structure/compositional properties (C. 
Burge and S. Karlin, J. Mol. Biol., 268:78-94 (1997), incorporated herein by reference). 
Method C refers to a polypeptide obtained by using a Hyseq proprietary software program that 
translates the novel polynucleotide and its complementary strand into six possible amino acid 
sequences (forward and reverse frames) and chooses the polypeptide with the longest open 
reading frame. 
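
The six-frame approach of Method C can be illustrated generically as follows. This sketch is 
not the proprietary program referred to above. It assumes the standard genetic code and, as a 
further assumption, treats an open reading frame as a stretch beginning at a methionine and 
running to the next stop codon or to the end of the frame; the function names and example 
sequence are hypothetical. 

```python
# Illustrative sketch: translate both strands in three frames each and
# keep the longest open reading frame (standard genetic code assumed).
from itertools import product

BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def translate(seq: str) -> str:
    """Translate complete codons; unknown codons become 'X'."""
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X")
                   for i in range(0, len(seq) - 2, 3))


def six_frame_translations(seq: str) -> list:
    """Return translations of three forward and three reverse frames."""
    forward = seq.upper()
    reverse = forward.translate(COMPLEMENT)[::-1]
    return [translate(strand[offset:])
            for strand in (forward, reverse)
            for offset in range(3)]


def longest_orf(seq: str) -> str:
    """Return the longest Met-initiated stretch free of stop codons."""
    best = ""
    for protein in six_frame_translations(seq):
        for segment in protein.split("*"):
            start = segment.find("M")
            if start != -1 and len(segment) - start > len(best):
                best = segment[start:]
    return best


example_orf = longest_orf("CCATGGCTAAATGA")  # made-up sequence
```

For the example sequence the longest frame is found in the third forward frame (ATG GCT AAA 
TGA). 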



5.3 EXAMPLE 3 

Novel Nucleic Acids 

The novel nucleic acids of the present invention were assembled from sequences that 
were obtained from a cDNA library by methods described in Example 1 above, and in some 
cases sequences obtained from one or more public databases. The nucleic acids were 
assembled using an EST sequence as a seed. Then a recursive algorithm was used to extend the 
seed EST into an extended assemblage, by pulling additional sequences from different 
databases (Hyseq's database containing EST sequences, dbEST, gb pri, and UniGene) that 
belong to this assemblage. The algorithm terminated when there were no additional sequences 
from the above databases that would extend the assemblage. Inclusion of component sequences 
into the assemblage was based on a BLASTN hit to the extending assemblage with BLAST 
score greater than 300 and percent identity greater than 95%. 

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full-length gene cDNA 
sequence and its corresponding protein sequence were generated from the assemblage. Any 
frame shifts and incorrect stop codons were corrected by hand editing. During editing, the 
sequences were checked using FASTY and/or BLAST against GenBank (i.e., dbEST, gb pri, 
UniGene, and Genpept) and Geneseq (Derwent). Other computer programs which may 
have been used in the editing process were phredPhrap and Consed (University of Washington) 
and ed-ready, ed-ext and cg-zip-2 (Hyseq, Inc.). The full-length nucleotide and amino acid 
sequences, including splice variants resulting from these procedures, are shown in the Sequence 
Listing as SEQ ID NO: 1-552. 

The nucleic acid sequences of the present invention were confirmed to have at least 
one transmembrane domain using the TMpred program 
(http://www.ch.embnet.org/software/TMPRED_form.html, herein incorporated by 
reference). 

Table 1 shows the various tissue sources of SEQ ID NO: 1-276. 

The homologs for polypeptides SEQ ID NO: 277-552, that correspond to nucleotide 
sequences SEQ ID NO: 1-276, were obtained by a BLASTP search against Genpept release 
124 and Geneseq (Derwent) release 200117 and against Genpept release 129 and Geneseq 
(Derwent) release (July 18, 2002). The results showing homologues for SEQ ID NO: 277- 
552 from Genpept 124 are shown in Table 2A. The results showing homologues for SEQ ID 
NO: 277-552 from Genpept 129 are shown in Table 2B. 

Using eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. 
Comp. Biol., Vol. 6, 219-235 (1999), http://motif.stanford.edu/ematrix-search/ herein 
incorporated by reference), all the polypeptide sequences were examined to determine 
whether they had identifiable signature regions. Scoring matrices of the eMatrix software 
package are derived from the BLOCKS, PRINTS, PFAM, PRODOM, and DOMO 
databases. Table 3 shows the accession number of the homologous eMatrix signature found 
in the indicated polypeptide sequence, its description, and the results obtained which include 
accession number subtype; raw score; p-value; and the position of signature in amino acid 
sequence. 

Using the Pfam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 
26(1) pp. 320-322 (1998) herein incorporated by reference) all the polypeptide sequences 
were examined for domains with homology to certain peptide domains. Table 4A shows the 
name of the Pfam model found, the description, the e-value and the Pfam score for the 
identified model within the sequence as described in United States priority application serial 
number 60/323,739, filed September 19, 2001, herein incorporated by reference in its 
entirety. Table 4B shows the name of the Pfam model found, the description, the e-value 
and the Pfam score for the identified model within the sequence using Pfam version 7.2. 
Further description of the Pfam models can be found at http://pfam.wustl.edu/. 

The GeneAtlas™ software package (Molecular Simulations Inc. (MSI), San Diego, 
CA) was used to predict the three-dimensional structure models for the polypeptides 
encoded by SEQ ID NO: 1-276 (i.e. SEQ ID NO: 277-552). Models were generated by (1) 
PSI-BLAST, which is a multiple alignment sequence profile-based searching method developed by 
Altschul et al. (Nucl. Acids. Res. 25, 3389-3408 (1997)), (2) High Throughput Modeling 
(HTM) (Molecular Simulations Inc. (MSI), San Diego, CA), which is an automated sequence 
and structure searching procedure (http://www.msi.com/), and (3) SeqFold™, which is a fold 
recognition method described by Fischer and Eisenberg (J. Mol. Biol. 209, 779-791 (1998)). 
This analysis was carried out, in part, by comparing the polypeptides of the invention with 
the known NMR (nuclear magnetic resonance) and x-ray crystal three-dimensional structures 
as templates. Table 5 shows: "PDB ID", the Protein DataBase (PDB) identifier given to 
template structure; "Chain ID", identifier of the subcomponent of the PDB template 
structure; "Compound Information", information of the PDB template structure and/or its 
subcomponents; "PDB Function Annotation" gives function of the PDB template as 
annotated by the PDB files (http://www.rcsb.org/PDB/); start and end amino acid position of 
the protein sequence aligned; PSI-BLAST score, the verify score, the SeqFold score, and the 
Potential(s) of Mean Force (PMF). The verify score is produced by GeneAtlas™ software 
(MSI), and is based on the Profile-3D threading program developed in Dr. David 
Eisenberg's laboratory (US patent no. 5,436,850 and Luthy, Bowie, and Eisenberg, Nature, 
356:83-85 (1992)) and a publication by R. Sanchez and A. Sali, Proc. Natl. Acad. Sci. USA, 
95:13597-13602. The verify score produced by GeneAtlas normalizes the verify score for 
proteins with different lengths so that a unified cutoff can be used to select good models as 
follows: 

Verify score (normalized) = (raw score - 1/2 high score)/(1/2 high score) 
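
The normalization given above can be written directly as a small function. The sketch below 
is illustrative only; the function and parameter names are hypothetical, and "high score" is 
taken simply as the value supplied by the scoring program, since its derivation is not given 
here. 

```python
# Illustrative sketch of the normalization formula stated above.


def normalized_verify_score(raw_score: float, high_score: float) -> float:
    """Return (raw - high/2) / (high/2)."""
    if high_score == 0:
        raise ValueError("high_score must be non-zero")
    half_high = high_score / 2
    return (raw_score - half_high) / half_high


example_scores = [normalized_verify_score(raw, 100.0)
                  for raw in (50.0, 75.0, 100.0)]
```

A raw score equal to the high score normalizes to 1, and a raw score equal to half the high 
score normalizes to 0. 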

The PMF score, produced by GeneAtlas™ software (MSI), is a composite scoring 
function that depends in part on the compactness of the model, sequence identity in the 
alignment used to build the model, pairwise and surface mean force potentials (MFP). As 
given in Table 5, a verify score between 0 and 1.0, with 1 being the best, represents a good 
model. Similarly, a PMF score between 0 and 1.0, with 1 being the best, represents a good 
model. A SeqFold™ score of more than 50 is considered significant. A good model may 
also be determined by one of skill in the art based on all the information in Table 5 taken in 
totality. 

Table 6 shows the position of the signal peptide in each of the polypeptides and the 
maximum score and mean score associated with that signal peptide using Neural Network 
SignalP V1.1 program (from Center for Biological Sequence Analysis, The Technical 
University of Denmark). The process for identifying prokaryotic and eukaryotic signal 
peptides and their cleavage sites is also disclosed by Henrik Nielsen, Jacob Engelbrecht, 
Soren Brunak, and Gunnar von Heijne in the publication "Identification of prokaryotic and 
eukaryotic signal peptides and prediction of their cleavage sites" Protein Engineering, Vol. 
10, no. 1, pp. 1-6 (1997), incorporated herein by reference. A maximum S score and a mean 
S score, as described in the Nielsen et al. reference, were obtained for the polypeptide 
sequences. 

Table 7 correlates each of SEQ ID NO: 1-276 to a specific chromosomal location. 

Table 8 shows the number of transmembrane regions, their location(s), and TMPred 
score obtained, for each of the SEQ ID NO: 277-552 that had a TMPred score of 500 or 
greater, using the TMpred program 
(http://www.ch.embnet.org/software/TMPRED_form.html). 

Table 9 is a correlation table of the novel polynucleotide sequences SEQ ID NO: 1- 
276, their corresponding polypeptide sequences SEQ ID NO: 277-552, their corresponding 
priority contig nucleotide sequences SEQ ID NO: 553-772, their corresponding priority 
contig polypeptide sequences SEQ ID NO: 773-992, and the US serial number of the priority 
application (all of which are herein incorporated in their entirety), in which the contig 
sequence was filed. 

5 Table 1 0 is a correlation table of the novel polynucleotide sequences SEQ ID NO: 1 - 

276, the novel polypeptide sequences SEQ ID NO: 277-552, and the corresponding SEQ ID 
NO in which the sequence was filed in priority US application bearing serial number 
60/323,739, filed September 19, 2001. 
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Table 1 



Tissue origin 


Library/RNA 
source 


HYSEQ Library 
Name 


SEQ ID NO: 


adult brain 


G1BCO 


AB300I 


8 76 78 80 101-102 109-111 113 153 194 

205 265 


adult brain 


GIBCO 


ABD003 


1-3 8-9 1 1 14 23 29 41 76 78 84 89 93 95 
104-106 109-111 113-114 126-127 136- 
139 151-152 162 164-166 176 178 181 
211 224 263 


adult brain 


Clontech 


ABR001 


23 38-39 47 91 103 106 139 143 171 224 
235 244 


adult brain 


Clontech 


ABR006 


1-3 8-9 22 29-30 36 38-39 41 51-53 66 76 
79 88 91 93 101-102 1 13 121 123 126-127 
133-134 139 147 161-162 170 186 192 
198 202-203 21 1 219 221 225 232 234 
252 262-263 271 275 


adult brain 


Clontech 


ABR008 


1-3 6 9-11 13 15 24 30-31 33 36 38-39 41 
44 46-47 55-56 61-65 74 76 80-81 87 93 
95 99-102 104-106109-110114-115 122- 
123 127-128 138-140 143 154-155 164- 
167 169-170 172-174 178 186 188 190 
199-200 202-206 211 213 217-219 221- 
222 230 232 234 242-243 245 252 263 
271 276 


adult brain 


BioChain 


ABR012 


5 28 161 21 1 


adult brain 


BioChain 


ABR013 


144 154 


adult brain j 


Invitrogen 


ABR014 


76 115 


adult brain 


invitrogen 


ABR015 


13 15 178211 


adult brain 


Invitrogen 


ABR016 


37 95 101-102 


adult brain 


Invitrogen 


ABT004 


6 23 47 79 101-103 106 109-1 10 1 13 1 15 
137 154 158 171-173 176 189-190 192- 
193 199 231 269 271 


cultured 
preadipocytes 


Stratagene 


ADP001 


4 26 33 81-83 86 99-102 114-115 132 154 
181 193 


adrenal gland 


Clontech 


ADR002 


9 13 32 40-41 57 72 76 84 93 103-105 115 
120 122 126 133 138 140 155 157 164- 
166 171 187 194 199-200 209 211 220 
224-225 264 


adult heart 


GIBCO 


AHR001 


1-3 5-6 8 1 1-12 14 21 26 28 41 55 87 99- 
104 106 109-110 113 115 118 120 124- 
125 132 136 139 145 153-154 158 160 
169 180 195 198 200 21 1 253 267 


adult kidney 


GIBCO 


AKD001 


1-7 15-16 19-21 28 42 57 60 84 87 91 95 
101-102 104-105 107 113 115 121-123 
126 129 132-133 137-138 140-144 149 1 
151-152 155-156 159 163-167 178 194 
198 205 211213 230 235 242 253 261 265 


adult kidney 


Invitrogen 


AKT002 


1-4 6 15 20-21 41 43 45-46 60 90 101-102 

1 AC 1 t\4L 1 AO 111 \IA 11C m 11/1 1 "J "7 

1 ID- 1 UO I Uo 111 1 14-1 ID \Z\ 1 34 I J / 

143 151-154 157 163 178 198 205 213 
223-224 230246 265 


adult lung 


GIBCO 


ALG001 


5 24 72 78 136 158 164-166 168 267 270 


lymph node 


Clontech 


ALN001 


64 121 154 216 235 


young liver 


GIBCO 


ALV001 


1-3 5 28 101-102 104 122 125 132 164- 
166 172 178 201 213 220 224 | 


adult liver 


Invitrogen 


ALV002 


15-16 2642 47 51-53 58 6075 84 87 101- 
102 104 109-1 10 1 12 1 14-1 15 138 143 
154 164-166 172 178 195 199 207 236 
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Table 1 



Tissue origin 


Library/RNA 
source 


HYSEQ Library 
Name 


SEQ ID NO: 








252 254 


adult liver 


Clontech 


ALV003 


1-3 104 115 120 169 172 j 


adult ovary 


Invitrogen 


AOV001 


1-5 21-22 26 28-29 32 38-39 41 48 78 84 
86-87 95 99-102 104 106-1 1 1 1 13-1 15 
118 120-121 126 131-134 136 138 145- 
146 149-150 153-154 157-158 160 163 
168-171 180 186-188 192 194 198-199 
201 209 211 214 216 224-225 231 242
246 253 265 


adult placenta 


Clontech 


APL001 


16 46 136


placenta 


Invitrogen 


APL002 


4 26 47 60 101-102 109-110 143 153 164- 
166 178 242 


adult spleen 


GIBCO 


ASP001 


1-3 6 15 17 72 82-83 101-102 104 109- 
110 118 121 129 132 136 158 178 181 198 
238 240 


adult testis 


GIBCO 


ATS001 


1-3 6 13 21 60 80 137 145 150 158 171 
247 


adult bladder 


Invitrogen 


BLD001 


6 94 114 164-166 169 178 188 190 200 

252 


bone marrow 


Clontech 


BMD001 


1-3 11-14 29 86 99-100 103-106 111 113
121-124 134 147-148 197-198 211 213 
225 230 253-254 264 


bone marrow 


GF 


BMD002 


6 9 13 22 32 51-53 55 60 74 82-83 93 95 
99-105 108-110 113 122-123 129 131 139 
143 147 153 159 161 164-166 178 186 
190 211 221 224 230 234 246 248 250
253-254 


adult colon 


Invitrogen 


CLN001 


47 60 158 173 181 201 211


adult cervix 


BioChain 


CVX001 


1-3 8 14 29 38-39 41-42 51-53 72 78-80 
84 86-87 97 99-100 104 106-107 111 113 
115 121-122 124 132-134 136 138 143 
145 153-155 178 181 188 195 198-199 
209 211 223 225 240 242 252-253 267


diaphragm 


BioChain 


DIA002


182 


endothelial cells 


Stratagene 


EDT001 


4-5 15-16 26 28-29 36 47 51-53 57 60 78 
99-102 104-105 107 109-110 113 115 121 
123 131-132 136 138 144 150 154 158 
164-166 171 178 198 201 213 224 235 
251-252 


fetal brain 


Clontech 


FBR001 


1-3 31 42 76 79 137 154 


fetal brain 


Clontech 


FBR004 


36 79 154


fetal brain 


Clontech 


FBR006 


5 10-11 13 15 24-25 30-33 38-39 41-42 47
62-64 76 78 80-81 95 99-102 104-105 
109-110 115 117-118 122-123 126-128
131 133 138 143 147 154 167 173 175 178 
188 194 199-200 202-204 206-207 211
218 222 234-235 244-245 252 262 266 
271-272 275 


fetal brain 


Clontech 


FBRs03 


5 28


fetal brain 


Invitrogen 


FBT002 


6 15 24 35-36 41 64 101-102 113 127 137
144 153-154 162 178 192 194 216 


fetal heart 


Invitrogen 


FHR001 


6 14-15 21 30 46 51-53 68 80-81 87 95 
101-102 106-107 109-110 113 115 118
122 136 139 145 178 188 196-197 199-
201 211 214 253 256-257 261


fetal kidney 


Clontech 


FKD001


1-3 6 105 109-110 178 198 265 


fetal kidney 


Clontech 


FKD002 


10 46 57 107 113 118 154-155 161 186 
205 221 253 267 


fetal lung 


Clontech 


FLG001 


9 13 121 132 136 161 181 184 192 231 


fetal lung 


Invitrogen 


FLG003 


6 15 19 60 89 107 111 113 147 154 158
164-166 190 224 238 242


fetal lung 


Clontech 


FLG004 


99-100 


fetal liver- 
spleen 


Columbia 
University 


FLS001 


1-7 9 11 17 26 28-29 38-39 41 48 51-53
57-60 72 74 76 84 90-91 93-95 97-102 
104-110 112-122 126 132-133 135-136 
138 143 149-150 153 159 161 167 172 
178 181 191 194 198 200-203 211 213
220 230 238 242 263 265 


fetal liver-spleen


Columbia 
University 


FLS002 


5-6 9 11 15 18 26 28 32 42 48 51-53 57-60 
72 79-80 82-84 89-90 93 95 97-98 101- 
102 105-110 112-119 126 129 132 134- 
135 137 153-155 157 164-167 169 172 
174 180-181 184 191 194 197 201-202 
207 213 220 224 226 230 238 241-242
263 265 268 


fetal liver-spleen


Columbia
University


FLS003


5 9 21 26 28 90-91 93-94 99-100 106 109- 
110 113 115-117 121 133 136 143-144 
153 164-166 174 178 252 


fetal liver 


Invitrogen 




32 35 101-102 106 112 120 126 137 172-
173 178 188 240 246 


fetal liver 


Clontech 


FLV002 


10 85 89 107 116 120 221 224 


fetal liver 


Clontech 


FLV004 


15 58 69-70 81 89-92 104-106 108 111
113-114 122-123 136 147 154-155 164- 
167 169 172 199 201 203 230 253 


fetal muscle 


Invitrogen 


FMS001 


6 14 32 86 107 125 132 154 158 211 


fetal muscle 


Invitrogen 


FMS002 


11 14 41 51-53 64 71 74 95 109-110 115
118 129 136 148 178 184 199-200 221 
242 253 255 


fetal skin 


Invitrogen 


FSK001 


1-4 6 10-11 13 15 24 29 78 86-87 91 97
99-102 105-107 109-110 115 132 134 136 
138 147 153-154 158 164-167 169 178 
186 188 192 200 210 225 228 234-235 
238 240 242 


fetal skin 


Invitrogen 


FSK002 


5-6 8 15 28-29 51-53 55 60 71 74 76 78 89 
91-92 94 103 105-106 111-112 115 117- 
118 122-123 136 138-139 144 147 155 
157 161 178 188 190 198-201 204 209 
211 221 225 230 253 259-260 267 272


umbilical cord 


BioChain 


FUC001 


4-5 28 38-39 78 80-81 84 86 99-102 104- 
106 109-110 113-116 121 124 126 132-

147 1 SI 1SR 7HO 711 71 fk 7AO 7^7 


fetal brain 


GIBCO 


HFB001 


1-3 8-10 14 16 22 24 26 29 76 78-79 95 
101-102 104-105 108 111 113 115 118
125-131 134 162 164-166 172 178 209 
220-221 224 244 


macrophage 


Invitrogen 


HMP001 


4 41 73 101-102 104 107-108 115 147 154 
159 169 183 196-197 199-200 219 


infant brain 


Columbia 
University 


IB2002 


7 10 14 16 22-23 25 29 31 36-39 47 50-53
59-60 64 76 81 87 99-100 105-108 112-
113 115 121 135 137-140 146-147 153
158 161-162 167 173 178 192 199 213 
224-225 232-234 239-240 242 254 269 


infant brain 


Columbia 
University 


IB2003


6 11 15-16 29 36-39 47 51-53 64 76 79
87-88 109-110 113 128 132 137 144 146- 
147 153-154 158 161-162 173 178 192 
199-200 224-225 232 240 242 269 


infant brain 


Columbia 
University 


IBM002 


139 161 242 


infant brain 


Columbia 
University 


IBS001


10 37 107 109-110 112 162 173 269


lung, fibroblast 


Stratagene 


LFB001 


4-5 15 28 41-42 57 72 76 80 99-100 107 
132 153 160 219 


lung tumor 


Invitrogen 


LGT002 


1-3 5-6 9-10 21 27-29 32 43 46 48 57 60 
78 84 87 104-106 109-113 115 118 122 
125 133-134 149 153 159 168 174 177- 
178 181 211 214 216 220 235 237-239
242 252 265 267 


lymphocytes 


ATCC 


LPC001 


13 41 60 78 84 91 95 99-103 105 107 109- 
110 112-113 118 125-126 132-133 143 
153 159 173 181 187 200 207 225 240 246 

265


leukocyte 


GIBCO 


LUC001 


1-3 5-6 9 11 15 18-19 28 41 43 45 51-53
57 60 74 78 80 82-83 93 95 97 99-100 
104-105 107-111 113-115 118 121-123 
125-126 132 137 144 146-148 150 155 
158-159 178 181 198-199 207 211 213
223 235 246-247 253 


leukocyte 


Clontech 


LUC003 


60 99-100 105 132 154 


melanoma 
from-cell-line- 
ATCC-#CRL- 
1424 


Clontech 


MEL004 


99-100 106 120 144 157 169 191 211 219- 
220 264 


mammary gland 


Invitrogen 


MMG001 


4-7 11 13 15-16 25-26 28 38-39 74 79 84
86-87 90-92 94 101-102 104 106-107 109- 
110 112-115 122 129 132 136 138 144 
147 153-154 157-158 164-166 168-169 
171-172 174-175 178 187-188 192 194 
208 221 240 242 263 265 


mixture 16 
tissues/mRNA 


various vendors 


SUP002 


15 38-39 44 85-86 112 117 120-121 123
126 147 178 186 190 222 224 254 259- 
260 272 


mixture 16 
tissues/mRNA 


various vendors 


SUP008 


99-100 111 114 158 246 


mixture 16 
tissues/mRNA 


various vendors 


SUP009 


1-3 


induced neuron- 
cells 


Stratagene 


NTD001 


16 29 43 76 79 105 107 132 162 


retinoic acid- 

induced- 

neuronal-cells 


Stratagene 


NTR001 


47 109-110 115 118 154 157 159 178 199 
230 


neuronal cells 


Stratagene 


NTU001 


1-3 16 29 60 89 106 109-110 118 143 200
209 


pituitary gland 


Clontech 


PIT004


1-4 51-53 72 77 109-111 113 174 240 247 
263 265 


placenta 


Clontech 


PLA003 


1-3 30 71 89 97 104 115 161 169 184 199
216 


prostate 


Clontech 


PRT001 


10 12 15 18 35 46 80 84 113 121 125 136
154 159 164-166 178 200 211 252 265
267 273 


rectum 


Invitrogen 


REC001 


6 32 48 67 80 90 101-102 107 109-110
122 154 159 168 173 192 221 229-230 
240 253 265-266 


salivary gland 


Clontech 


SAL001 


11 15 35 49 60 84 94 104 109-110 123 
134 137 174 178 246 


small intestine 


Clontech 


SIN001 


5-6 9 11 13 16 26 28-29 38-39 47 51-53
57 72 76-77 80 86-87 91 93 101-102 104- 
105 107 109-110 113-114 120-122 126 
132 134 136 144 155 159 164-166 168 
181 188 209 234 240 247 252-254 265 
267 


skeletal muscle 


Clontech 


SKM001 


7 9 14 24 35 42 57 107 109-110 125 150
153 195 


spinal cord 


Clontech 


SPC001 


1-3 23-24 38-39 41 46 87 91 99-103 109- 
111 113 115 118 125-126 132 145 153 
159 161-162 169 181 194 198-200 209 
211 224-225 231 247 252 272 


adult spleen 


Clontech 


SPLc01


6 15 82-83 91 107 114 147 159 178 181 
202 221 246


stomach 


Clontech 


STO001 


10 15 58 91


thalamus 


Clontech 


THA002 


16 76 87 90 104 132 153 157 162 172 
175-176 190 194 211 240 


thymus 


Clontech 


THM001 


1-3 26 32 38-39 41 60 107 132 136 157
211 231 246 261 263-264


thymus 


Clontech 


THMc02 


1-3 5 9 15-16 19 21 28 33 38-39 46 51-54 
58 71 75 80 82-83 91 93 95 97 103-105 
115 122 132-133 147 157 163 173 178 
186 190 194 199 204 211 219 225 230 235 
246 253 263 


thyroid gland 


Clontech 


THR001 


1-7 9 12-13 15 19 28 41 43 45 47 51-52 72 
78 80 82-84 86-87 93-95 99-100 104 106- 
110 115-116 126 130 136-139 154-155
159-160 163 168 186-187 199-201 210- 
212 216 232 242 265 267 


trachea 


Clontech 


TRC001 


18 28-29 46 101-102 113 143 149 158 192 
194 211 238 240 


uterus 


Clontech 


UTR001 


30 38-39 86 121 132 137 150 155 


bone marrow 


STM001 


115 


199 



*The 16 tissue/mRNAs and their vendor sources are as follows: 1) normal adult brain mRNA
(Invitrogen), 2) normal adult kidney mRNA (Invitrogen), 3) normal fetal brain mRNA (Invitrogen), 4) normal
adult liver mRNA (Invitrogen), 5) normal fetal kidney mRNA (Invitrogen), 6) normal fetal liver mRNA
(Invitrogen), 7) normal fetal skin mRNA (Invitrogen), 8) human adrenal gland mRNA (Clontech), 9) human
bone marrow mRNA (Clontech), 10) human leukemia lymphoblastic mRNA (Clontech), 11) human thymus
mRNA (Clontech), 12) human lymph node mRNA (Clontech), 13) human spinal cord mRNA (Clontech),
14) human thyroid mRNA (Clontech), 15) human esophagus mRNA (BioChain), 16) human conceptional
umbilical cord mRNA (BioChain).

SEQ
ID 

NO: 


Accession 
No. 


Species 


Description 


Score 


% 
Identity 


277


gi1321818


Gallus gallus 


RING zinc finger protein 


1355 


91 


277


gi2746333 


Homo sapiens 


RING zinc finger protein (RZF) mRNA, 
complete cds. 


1455 


100 


277 


gi3387925 


Homo sapiens 


clone 24450 RING zinc finger protein 
RZF mRNA, complete cds. 


1455 


100 


278 


gi2746333 


Homo sapiens 


RING zinc finger protein (RZF) mRNA, 
complete cds. 


1445 


94 


278 


gi3387925 


Homo sapiens 


clone 24450 RING zinc finger protein 
RZF mRNA, complete cds. 


1445 


94 


278 


gi14602541


Homo sapiens 


ring finger protein 13, clone MGC:13487
IMAGE:3683407, mRNA, complete cds. 


1445 


94 


279 


gi2746333 


Homo sapiens 


RING zinc finger protein (RZF) mRNA, 
complete cds. 


1338 


100 


279 


gi3387925 


Homo sapiens 


clone 24450 RING zinc finger protein 
RZF mRNA, complete cds. 


1338 


100 


279 


gi14602541


Homo sapiens 


ring finger protein 13, clone MGC:13487
IMAGE:3683407, mRNA, complete cds. 


1338 


100 


280 


gi10438603


Homo sapiens 


cDNA: FLJ22282 fis, clone HRC03861. 


1341 


96 


280 


AAB24463 


Homo sapiens 


Human secreted protein sequence encoded 
by gene 27 SEQ ID NO:88.


1341 


96 


280 


AAB34813 


Homo sapiens 


Human secreted protein sequence encoded 
by gene 41 SEQ ID NO: 101. 


696 


93 


281 


gi6841548 


Homo sapiens 


HSPC163 


423 


100 


281 


gi12653595


Homo sapiens 


HSPC163 protein, clone MGC:772
IMAGE:3163724, mRNA, complete cds.


423 


100 


281 


AAY91543 


Homo sapiens 


Human secreted protein sequence encoded 
by gene 93 SEQ ID NO:216.


423 


100 


282 


gi2586350 


Homo sapiens 


tetraspan (NAG-2) mRNA, complete cds. 


842 


93 


282 


gi2997747 


Homo sapiens 


tetraspan TM4SF (TSPAN-4) mRNA, 
complete cds. 


842 


93 


282 


gi12653241


Homo sapiens 


transmembrane 4 superfamily member 7, 
clone MGC:8437 IMAGE:2821236, 
mRNA, complete cds. 


842 


93 


283 


gi15080477


Homo sapiens 


Similar to RIKEN cDNA 2310010G13
gene, clone MGC:9810 IMAGE:3860434, 
mRNA, complete cds. 


2037 


97 


283 


gi9104959 


Xylella 

fastidiosa 9a5c 


beta-lactamase induction signal transducer 
protein 


161 


29 


283 


gi1778812


Neisseria 
gonorrhoeae 


No definition line found 


259 


27 


284 


gi12053215


Homo sapiens 


mRNA; cDNA DKFZp434K2435 (from 
clone DKFZp434K2435); complete cds. 


2762 


100 


284 


AAY87197 


Homo sapiens 


Human secreted protein sequence SEQ ID 
NO:236. 


86 


24 


284 


AAY27598 


Homo sapiens


Human secreted protein encoded by gene
No. 32. 


63 


29 


285 


gi10438815


Homo sapiens 


cDNA: FLJ22427 fis, clone HRC09013.


4487 


98 


285 


gi15076843


Homo sapiens 


pecanex-like protein 1 mRNA, complete 
cds. 


759 


44 


285 


gi13171105


Takifugu 
rubripes 


pecanex 


685 


44 


286 


gi2828808 


Bacillus 
subtilis 


glucose transporter 


100 


23 


286 


gi14023148


Mesorhizobium loti


probable fosmidomycin resistance protein


112


25


286 


gi2650264 


Archaeoglobus 
fulgidus 


oxalate/formate antiporter (oxlT-2) 


102 


23 


287 


gi180137


Homo sapiens 


Human membrane cofactor protein (MCP) 
mRNA, complete cds. 


1980 


96 


287 


AAW27484 


Homo sapiens 


Human MCP. 


1980 


96 


287 


gi512457


Homo sapiens 


membrane cofactor protein 


1976 


95 


288 


gi10437579


Homo sapiens 


cDNA: FLJ21472 fis, clone COL04936.


1019


100 


288 


AAE01687 


Homo sapiens 


Human gene 16 encoded secreted protein
HDPMM88, SEQ ID NO:99. 


1019 


100


288 


gi14043759


Homo sapiens 


clone IMAGE:4111596, mRNA, partial
cds. 


563 


58 


289 


AAY41401 


Homo sapiens 


Human secreted protein encoded by gene 
94 clone HLYCH68. 


392 


100 


289 


AAB08863 


Homo sapiens 


Amino acid sequence of a human 
secretory protein. 


392 


100 


289 


gi575398 


Saccharomyces cerevisiae


regulator of carbon catabolite repression 


54 


57 


290 


gi14250010


Homo sapiens 


clone MGC:14489 IMAGE:4244549,
mRNA, complete cds. 


2035 


99 


290 


gi1495419


Homo sapiens 


H.sapiens ART3 gene. 


1713 


97 


290 


gi2677616 


Mus musculus


NAD(P)(+)--arginine ADP-ribosyltransferase


1080 


58 


291 


gi13182757


Homo sapiens 


HTPAP mRNA, complete cds. 


598 


100 


291 


AAB70690 


Homo sapiens 


Human hDPP protein sequence SEQ ID 
NO:7. 


598 


100 


291 


gi14020949


Arabidopsis
thaliana


phosphatidic acid phosphatase 


250 


38 


292 


AAB88418 


Homo sapiens 


Human membrane or secretory protein 
clone PSEC0181. 


725 


100 


292 


gi2909844 


Homo sapiens 


prostate stem cell antigen (PSCA) mRNA, 
complete cds. 


109


32 


292 


gi9367212 


Homo sapiens 


mRNA for prostate stem cell antigen 
(PSCA gene). 


109


32 


293 


gi12718841


Mus musculus 


Skullin 


283 


38 


293 


gi4191356 


Mus musculus 


claudin-6 


281 


38


293 


gi13543081


Mus musculus 


claudin 6 


281 


38 


294 


gi2618609 


Capra hircus 


mhc class II DRA


636 


80 


294 


gi165868


Ovis aries 


MHC Ovar-DR-alpha 


632 


79 


294 


gi207708 


Sciurus aberti 


MHC class II DR-alpha 


652 


82 


295 


gi14025214


Mesorhizobium loti


probable amidase 


348 


31 


295 


gi7226601


Neisseria 

meningitidis 

MC58 


Glu-tRNA(Gln) amidotransferase, subunit 
A 


398 


28 


295 


gi7380209 


Neisseria 

meningitidis 

Z2491 


Glu-tRNA(Gln) amidotransferase subunit
A 


387 


27 


296 


gi12620132


Homo sapiens 


renal sodium/sulfate cotransporter mRNA, 
complete cds. 


3100 


100 


296 


gi10439272


Homo sapiens 


cDNA: FLJ22760 fis, clone KAIA0881.


3096 


99 


296 


gi310183 


Rattus
norvegicus 


sodium dependent sulfate transporter 


2627 


82 


297 


gi12653037


Homo sapiens


clone IMAGE:3355813, mRNA, partial
cds.


1574


100


297


AAY44245


Homo sapiens 


Human cell signalling protein-8. 


l one 


100


297 


AAW64220 


Homo sapiens 


Human secreted protein from clone 
CG300 3. 


1195 


98 


298 


gi9588085 


Homo sapiens 


mRNA for TAPL, complete cds. 


2338 


99 


298 


gi9622987 


Homo sapiens 


ATP-binding cassette protein ABCB9 
(ABCB9) mRNA, complete cds. 


2338 


99


298 


AAE02437 


Homo sapiens 


Human ATP binding cassette, ABCB9 
transporter protein. 


2338 


99 


299 


AAY87237 


Homo sapiens 


Human signal peptide containing protein 
HSPP-14 SEQ ID NO:14.


110


30


299 


AAB87384 


Homo sapiens 


Human gene 43 encoded secreted protein 
HSLGM81, SEQ ID NO: 125. 


110 


30 


299 


AAB87410 


Homo sapiens 


Human gene 43 encoded secreted protein 

HJ>YBM41, SEQ ID NO: 151.


110


30


300


gi3874880


Caenorhabditis 
elegans 


C41C4.2




>to 
4y 


300 


gi13785612


Mus musculus 


sideroflexin 1 


404 


39 


300 


gi13543138


Mus musculus 


RIlCbN cDNA ZolOuOzUU!) gene 


Af\A 




301 


gi5114275


Homo sapiens 


MAB21L2 (MAB21L2) gene, complete
cds.


113 


33 


301 


gi9964007 


Homo sapiens 


MAB21L2 protein (MAB21L2) mRNA,
complete cds. 


113 


33 


301 


gi14134002


Homo sapiens 


MAB21L2 protein mRNA, complete cds.


113


33 


302 


gi7020704 


Homo sapiens 


cDNA FLJ20533 fis, clone KAT10931.


829 


AO 


302 


gi15030135


Mus musculus 


RIKEN cDNA 1110020A09 gene


111 


60 


302


gi5824484


Caenorhabditis 
elegans 


F32D8.5b 


111 


25 


303 


gi10433539


Homo sapiens 


cDNA FLJ12133 fis, clone
MAMMA1000278. 


319


30


303 


AAB93897


Homo sapiens 


Human protein sequence SEQ ID 
NO: 13844. 


319 


30 


303 


AAW64461


Homo sapiens 


Human secreted protein from clone B121. 


313


30


304 


gi6841548 


Homo sapiens 


HSPC163 


489 


100 


304 


gi12653595


Homo sapiens 


HSPC163 protein, clone MGC:772 
IMAGE:3163724, mRNA, complete cds.


489 


100 


304 


AAY91543 


Homo sapiens 


Human secreted protein sequence encoded 
by gene 93 SEQ ID NO:216. 


489 


100 


305


gi4877582 


Homo sapiens 


lipoma HMGIC fusion partner (LHFP) 
mRNA, complete cds. 


222 


28 


305 


AAY87336 


Homo sapiens 


Human signal peptide containing protein 
HSPP-113 SEQ ID NO:113.


222 


28 


305 


AAW88508


Homo sapiens 


Human stomach cancer clone HP10480-encoded
membrane protein.


94


26 


306 


AAB87576 


Homo sapiens


Human PRO3579


1125 


98 


306 


gi2315510 


Caenorhabditis 
elegans 


similar to 1-acyl-glycerol-3-phosphate
acyltransferases 


501 


45 


306 


gi3877657 


Caenorhabditis 
elegans 


contains similarity to Pfam domain: 
PF01553 (Acyltransferase), Score=144.3, 
E-value=7.1e-40, N=1


364 


44 


307 


AAY94954 


Homo sapiens 


Human secreted protein clone iw66_1
protein sequence SEQ ID NO:114.


596 


68 


307 


gi7259234 


Mus musculus 


contains transmembrane (TM) region 


562 


63 


307 


AAB62810 


Homo sapiens 


Human nervous system associated protein
NSPRT3 amino acid sequence.


536


60


308 


gi4580997 


Mus musculus 


cAMP inducible 2 protein 


2377 


87 


308 


gi7543982 


Homo sapiens 


mRNA for glycerol 3-phosphate permease 
(SLC37A1 gene). 


842 


60 


308 


gi11095363


Homo sapiens 


glycerol 3-phosphate permease 
(SLC37A1) mRNA, complete cds.


836 


60 


309 


AAG71797 


Homo sapiens 


Human olfactory receptor polypeptide, 
SEQ ID NO: 1478. 


755 


100 


309 


gi12007408


Mus musculus 


B1 olfactory receptor


625 


79 


309 


gi12007420


Mus musculus 


B5 olfactory receptor 


609 


82 


310 


gi12803871


Homo sapiens 


clone MGC:4170 IMAGE:3618204, 
mRNA, complete cds. 


373 


100 


310 


gi3881055 


Caenorhabditis 
elegans 


Y48A6B.1 


57 


59 


310 


gi13398356


Trichoplusia ni 


acyl-CoA delta-11 desaturase


46 


53 


311 


gi11128456


Homo sapiens 


nicotinic acetylcholine receptor subunit 
alpha 10 mRNA, complete cds. 


2370 


100 


311 


gi13173184


Homo sapiens 


nicotinic acetylcholine receptor subunit 
alpha 10 (CHRNA10) gene, complete cds. 


2370 


100 


311 


gi12053839


Homo sapiens 


mRNA for neuronal nicotinic 
acetylcholine alpha 10 subunit 
(NACHRA10 gene).


2370 


100 


312 


gi14328885


Mus musculus 


spermatogenic immunoglobulin 
superfamily protein 


630 


40 


312 


gi7767239 


Homo sapiens 


nectin-like protein 2 (NECL2) mRNA, 
complete cds. 


628 


41 


312 


gi4519602


Homo sapiens 


IGSF4 gene, exon 10 and complete cds.


625 


40 


313 


AAA40083 
aal 


Homo sapiens 


Human brain-specific transmembrane 
glycoprotein encoding cDNA. 


1637 


54 


313 


AAB09968 


Homo sapiens 


Human brain-specific transmembrane 
glycoprotein. 


1637 


54 


313 


AAB12448


Homo sapiens 


Human hh00149 protein SEQ ID NO:4. 


1637 


54


314 


gi14017379


Homo sapiens 


tumor endothelial marker 7 precursor 
(TEM7) mRNA, complete cds. 


2691 


100 


314 


AAB31211 


Homo sapiens 


Amino acid sequence of human 
polypeptide PRO6003. 


1297 


57 


314 


AAW58986 


Homo sapiens 


Homo sapiens adult brain clone CC194 4
encoded protein. 


560 


99 


315 


gi14017379


Homo sapiens 


tumor endothelial marker 7 precursor 
(TEM7) mRNA, complete cds. 


2592 


97 


315 


AAB31211


Homo sapiens 


Amino acid sequence of human 
polypeptide PRO6003. 


1040 


53 


315 


AAW58986 


Homo sapiens 


Homo sapiens adult brain clone CC194 4 
encoded protein. 


461 


87 


316 


AAG71567


Homo sapiens 


Human olfactory receptor polypeptide, 
SEQ ID NO: 1248. 


1414 


100 


316 


AAG71576


Homo sapiens 


Human olfactory receptor polypeptide, 
SEQ ID NO: 1257. 


726 


52 


316 


AAG72477 


Homo sapiens 


Human OR-like polypeptide query 
sequence, SEQ ID NO: 2158. 


726 


52 


317 


gi14495648


Homo sapiens 


clone MGC:15606 IMAGE:3163718,
mRNA, complete cds. 


2958 


100 


317 


AAB74709 


Homo sapiens 


Human membrane associated protein 
MEMAP-15. 


338 


31


317 


gi7020023 


Homo sapiens 


cDNA FLJ20127 fis, clone COL06176. 


149


29 


318 


AAB88430 


Homo sapiens 


Human membrane or secretory protein 
clone PSEC0205. 


2226 


99 


318 


AAY44363 


Homo sapiens 


Human cell cycle regulation protein-4. 


1827 


100 


318 


AAB08956 


Homo sapiens 


Human secreted protein sequence encoded 
by gene 24 SEQ ID NO:113.


1819 


99 


319 


AAY19506 


Homo sapiens 


Amino acid sequence of a human secreted 
protein. 


1120 


100 


319 


gi11177546


Homo sapiens 


LIM2 (LIM2) and natural killer group 7 
(NKG7) genes, complete cds. 


90 


26 


319 


gi13445660


Homo sapiens


MP19 (LIM2) mRNA, complete cds,
alternatively spliced. 


90 


26


320


gi784990


Homo sapiens 


H.sapiens DNA for 5-HT5A exon1.


1645 


100 


320 


gi6064324 


unidentified 


GENE DU RECEPTEUR 5HT5A
HUMAIN


1611 


98 


320 


AAR45848 


Homo sapiens 


Human 5HT5a serotonin receptor. 


1611 


98 


321 


gi2695874 


Homo sapiens 


H.sapiens mRNA for P2Y-like G-protein
coupled receptor. 


175 


28 


321 


AAR53752 


Homo sapiens 


Seven transmembrane receptor (R12). 


175 


28 


321 


AAW07617 


Homo sapiens 


Human G-protein thrombin-like receptor.


175 


28 


322 


AAY25806 


Homo sapiens 


Human secreted protein fragment encoded 
from gene 23. 


1663 


98 


322 


gi5901846 


Drosophila 
melanogaster 


BcDNA.GH12144 


627 


43 


322 


AAB12140 


Homo sapiens 


Hydrophobic domain protein isolated from 
WERI-RB cells. 


353 


36 


323 


gi10438949


Homo sapiens 


cDNA: FLJ22529 fis, clone HRC12842.


1290 


100 


323 


AAB12119 


Homo sapiens 


Hydrophobic domain protein from clone 
HP02869 isolated from KB cells. 


448 


100 


323 


gi13384443


Caenorhabditis 
elegans 


similar to 1-acyl-glycerol-3-phosphate
acyltransferases


294 


26 


324 


AAY25736 


Homo sapiens 


Human secreted protein encoded from 
gene 26. 


343 


100 


324 


gi14530705


Caenorhabditis 
elegans 


Similarity to C.elegans UNC-7 protein 
(SW:UNC7_CAEEL), contains similarity 
to Pfam domain: PF00876 (Innexin),
Score=640.8, E-value=2.4e-189, N=1


75 


36 


324 


gi142083


Anabaena sp.


ribulose 1,5-bisphosphate 
carboxylase/oxygenase small subunit 


63 


41 


325 


AAJJ44336 


Homo sapiens 


Human secreted protein encoded by gene 
2 clone HROAM11.


169 


100 


325


a a pmoni 
AAUU35U1 


Homo sapiens 


Human secreted protein, SEQ ID NO:
/OOA.


64


41


325 


gi6139004 


Echinococcus 

■111*1 UlOl 19 


NADH dehydrogenase subunit 6 


45 


55 


326 


gi10566471


Mus musculus


Gliacolin 


1284 


94 


326 


gi14278927


Mus musculus 


gliacolin 


1284 


94 


326 


gi3747097 


Homo sapiens 


C1q-related factor mRNA, complete cds.


974 


71 


327 


gi13506225


Mus musculus 


ST7 protein forml splice variant a 


2996 


99 


327 


gi9230665 


Homo sapiens 


FAM4A1 splice variant a (FAM4A1) 
mRNA, complete cds. 


1761 


96 


327 


gi13506227


Mus musculus 


ST7 protein forml splice variant b 


1761 


96 


328 


gi9230665 


Homo sapiens 


FAM4A1 splice variant a (FAM4A1) 
mRNA, complete cds. 


2496 


97


328


gi13506227


Mus musculus 


ST7 protein forml splice variant b 


2489 


96 


328 


gi13506225


Mus musculus 


ST7 protein forml splice variant a


1366 


92 


329 


gi9230667 


Homo sapiens 


FAM4A1 splice variant b (FAM4A1)
mRNA, complete cds. 


2862 


97 


329 


gi13506225


Mus musculus 


ST7 protein forml splice variant a


2848 


96 


329 


gi9230665 


Homo sapiens 


FAM4A1 splice variant a (FAM4A1) 
mRNA, complete cds. 


1608 


92 


330 


gi292057 


Homo sapiens 


Human EBV induced G-protein coupled 
receptor (EBI2) mRNA, complete cds.


321 


38 


330 


AAR54080 


Homo sapiens 


Epstein Barr virus induced (EBI-2)
polypeptide. 


321 


38 


330 


AAW53623 


Homo sapiens 


Epstein Barr virus induced gene 2 (EBI-2). 


321 


38 


331 


gi10434308


Homo sapiens 


cDNA FLJ12672 fis, clone
NT2RM4002339. 


3584 


99 


331 


AAB94231 


Homo sapiens 


Human protein sequence SEQ ID 
NO: 14604. 


3584 


99 


331 


gi10436632


Homo sapiens 


cDNA FLJ14225 fis, clone
NT2RP3004051. 


3570 


100 


332 


gi3462455 


Mus musculus 


junctional adhesion molecule 


116 


28 


332 


AAY23325 


Homo sapiens 


A33 related antigen JAM. 


116 


28 


332 


gi8650528 


Rattus 
norvegicus 


junctional adhesion molecule JAM 


109 


27 


333 


gi14250676


Homo sapiens 


Similar to RIKEN cDNA 2310002F18 
gene, clone MGC:10413
IMAGE:3954787, mRNA, complete cds. 


1977 


99 


333 


AAY27589 


Homo sapiens 


Human secreted protein encoded by gene 
No. 23. 


1578 


100 


333 


gi12082328


Arabidopsis 
thaliana 


para-hydroxy bezoate polyprenyl 
diphosphate transferase 


792 


64 


334 


gi12655071


Homo sapiens 


transmembrane 4 superfamily member 4, 
clone MGC:1477 IMAGE:3051146,
mRNA, complete cds. 


859 


98 


334 


gi953239 


Homo sapiens 


Human intestinal and liver tetraspan 
membrane protein (il-TMP) mRNA, 
complete cds. 


859 


98 


334 


gi11493837


Rattus 
norvegicus 


tetraspan protein LRTM4 


791 


85 


336 


gi14336694


Homo sapiens 


16p13.3 sequence section 2 of 8.


4100 


99 


336 


gi10716072


Homo sapiens 


mRNA for M83 protein, complete cds. 


4089


99 


336 


gi10716074


Mus musculus 


M83 protein 


3115 


75 


337 


gi11023146


Homo sapiens 


corneal N-acetylglucosamine-6-O-
sulfotransferase (CHST6) mRNA,
complete cds. 


2056 


100 


337 


gi11023149


Homo sapiens 


intestinal N-acetylglucosamine-6-O-
sulfotransferase (CHST5) and corneal N-
acetylglucosamine-6-O-sulfotransferase
(CHST6) genes, complete cds.


2056 


100 


337 


gi12060804


Homo sapiens 


N-acetylglucosamine 6-O-sulfotransferase
GST-4beta mRNA, complete cds. 


2056 


100 


338 


AAG71850 


Homo sapiens 


Human olfactory receptor polypeptide, 
SEQ ID NO: 1531. 


1142 


71 


338 


AAG71809 


Homo sapiens 


Human olfactory receptor polypeptide, 
SEQ ID NO: 1490. 


1049 


74 


338 


AAG71818 


Homo sapiens 


Human olfactory receptor polypeptide,
SEQ ID NO: 1499.


1014


68


339 


AAG71850 


Homo sapiens 


Human olfactory receptor polypeptide, 
SEQ ID NO: 1531. 


1128 


71 


339 


AAG71809


Homo sapiens 


Human olfactory receptor polypeptide, 
SEQ ID NO: 1490. 


1035 


74 


339 


AAG71818 


Homo sapiens 


Human olfactory receptor polypeptide, 
SEQ ID NO: 1499. 


1014 


68 


340 


gi7960136 


Homo sapiens 


neuroligin 3 isoform gene, complete cds, 
alternatively spliced. 


4557 


100 


340 


gi1145791


Rattus 
norvegicus 


neuroligin 3 


4505 


98 


340 


gi7960135 


Homo sapiens 


neuroligin 3 isoform gene, complete cds, 
alternatively spliced. 


3623 


96 


341 


gi5525078 


Rattus 
norvegicus 


seven transmembrane receptor 


788 


31 


341


AAY57288 


Homo sapiens 


Human GPCR protein (HGPRP) sequence 
(clone ID 3036563). 


752 


29 


341 


AAY40440 


Homo sapiens 


Human brain-derived G-protein coupled
receptor protein. 


746 


29 


342 


AAG71424 


Homo sapiens 


Human olfactory receptor polypeptide, 
SEQ ID NO: 1105. 


853 


88 


342 


AAG72315 


Homo sapiens 


Human olfactory receptor polypeptide, 
SEQ ID NO: 1996. 


915 


96 


342 


AAG71431 


Homo sapiens 


Human olfactory receptor polypeptide, 
SEQ ID NO: 1112.


595 


60 


343 


gi10434098


Homo sapiens 


cDNA FLJ12547 fis, clone
NT2RM4000634. 


1612 


84 


343 


AAB95124 


Homo sapiens 


Human protein sequence SEQ ID 
NO: 17122.


1612 


84 


343 


gi854065 


Human 
herpesvirus 6 


U88 


809 


52 


344 


AAG71823 


Homo sapiens 


Human olfactory receptor polypeptide, 
SEQ ID NO: 1504.


1627 


100 


344 


AAG71859 


Homo sapiens 


Human olfactory receptor polypeptide, 
SEQ ID NO: 1540.


1085 


67 


344 


AAG72185 


Homo sapiens 


Human olfactory receptor polypeptide, 
SEQ ID NO: 1866.


980 


60 


345 


AAY91625 


Homo sapiens 


Human secreted protein sequence encoded 
by gene 22 SEQ ID NO:298. 


1968 


94 


345 


AAU00437 


Homo sapiens 


Human dendritic cell membrane protein 
FIRE. 


1925 


78 


345 


AAY59300 


Homo sapiens 


Human EGPCR polypeptide. 


1174 


57 


346 


AAY91625 


Homo sapiens 


Human secreted protein sequence encoded 
by gene 22 SEQ ID NO:298. 


1968 


94 


346 


AAU00437


Homo sapiens 


Human dendritic cell membrane protein 
FIRE. 


1925 


78 


346 


AAY59300 


Homo sapiens 


Human EGPCR polypeptide. 


1174 


57 


347 


gi4098462 


Sus scrofa 


luteinizing hormone beta subunit 


41 


53 


347 


gi12232003


Cercopagis 
pengoi 


NADH dehydrogenase subunit 5 


81 


32 


348 


AAW74874 


Homo sapiens 


Human secreted protein encoded by gene 
146 clone HSNAK17.


349 


100 


348 


gi3329179 


Chlamydia 
trachomatis 


Phosphoglycerate Mutase 


68 


33 
348


gi9105100 


Xylella
fastidiosa 9a5c


transport protein 


68 


46


349 


AAY04301


Homo sapiens 


Human secreted protein encoded by gene 
9. 


82 


33


349 


gi15004512


Podophyllum 
peltatum 


succinate dehydrogenase subunit 3 


79 


32 


349 


gi841378 


Saccharomyces
cerevisiae


Gpi2p 


90 


34 


350 


AAB88406 


Homo sapiens 


Human membrane or secretory protein 
clone PSEC0162. 


1421 


99 


350 


AAW88579 


Homo sapiens 


Secreted protein encoded by gene 46 clone 
HCFMV39. 


479 


95 


350 


AAY41111


Homo sapiens 


Human TANGO 129 (T129) mature 
protein. 


225 


35 


351 


gi292793 


Homo sapiens 


(clone HBVT72) T cell receptor beta chain
(TCRB) mRNA, VDJC region, partial cds. 


636 


98 


351


gi457274 


Homo sapiens 


Human T-cell receptor beta chain gene, V 
region, partial cds. 


479 


98 


351


gi495428


Macaca 
mulatta 


T cell receptor beta chain 


477 


85 


352 


AAY10839


Homo sapiens 


Amino acid sequence of a human secreted 
protein. 


225 


95 


352 


gi15163613


Agrobacterium 
tumefaciens 


AGR_pTi_226p 


66 


40 


352


gi903711


Daucus carota 


cytochrome oxidase II 


59 


36 


353 


AAY16784


Homo sapiens 


Human secreted protein (clone co 1 000 1 ). 


488 


100 


353 


gi1850866


Macropus 
robustus 


ATPase subunit 8 


68 


31 


353 


AAY41439 


Homo sapiens 


Fragment of human secreted protein 
encoded by gene 24. 


63 


43 


354 


gi6573749 


Arabidopsis
thaliana


F20B24.9 


58 


38 


354 


gi325236 


Influenza B 
virus 


nb 


61 


34 




AAKI IZj4 


Homo sapiens 


Human IL-4 receptor. 


60 


52 


355


gi12652903


Homo sapiens 


clone MGC:3103 IMAGE:3350518, 
mRNA, complete cds. 


1704 


100 


355


AAA /(AAOl 

33 1 


Homo sapiens 


Human brain-specific transmembrane 
glycoprotein encoding cDNA. 


1019 


43 


355 


AAB09968 


Homo sapiens 


Human brain-specific transmembrane 
glycoprotein. 


1019 


43 


356 


gi10439087


Homo sapiens


cDNA: FLJ22625 fis, clone HSI06009.


1792 


100 






Homo sapiens 


Human secreted protein encoded by gene 
oz cione nuunnj 1 . 


1555


94 


356 


AAY41747 


Homo sapiens


Unman PP A^l^ nrntpin cpni i*»nr**» 
nullum i ivwjjt piuiciii acifUCiJLv* 






358 


gi13676372


Homo sapiens 


clone MGC:4595 IMAGE:3345729, 
mRNA, complete cds. 


1886 


98 


358 


AAY41690 


Homo sapiens 


Human PRO329 protein sequence.


1886 


98 


358 



AAB44246 


Homo sapiens 


Human PRO329 (UNQ291) protein
sequence SEQ ID NO:45. 


1886 


98 


359 


gi13676372


Homo sapiens 


clone MGC:4595 IMAGE:3345729, 
mRNA, complete cds. 


1905 


99 


359 


AAY41690 


Homo sapiens 


Human PRO329 protein sequence.


1905 


99 


359 


AAB44246 


Homo sapiens 


Human PRO329 (UNQ291) protein
sequence SEQ ID NO:45.


1905


99






360 


AAW74807 


Homo sapiens 


Human secreted protein encoded by gene 
79 clone HSKNE46. 


270 


100 


360 


gi2145070


Mus musculus 


ml7r splice variant 


49 


46 


360 


AAB34697 


Homo sapiens 


Human secreted protein encoded by DNA 
clone vq6 1 . 


66 


45 


361 


gi6959684 


Mus musculus 


glycolipid transfer protein 


103 


26 


361 


gi14041214


Human 
herpesvirus 4 


EBNA-LP protein 


76 


36 


361 


gi6959686 


Homo sapiens 


glycolipid transfer protein mRNA, 
complete cds. 


93 


24 


362 


gi13623231


Homo sapiens


Similar to RIKEN cDNA 1200013A08
gene, clone MGC:3047 IMAGE:3343261,
mRNA, complete cds. 


2337 


100 


362 


gi14041843


Homo sapiens


cDNA FLJ14363 fis, clone
HEMBA1000719.


2270 


98 


362 


AAB92464 


Homo sapiens 


Human protein sequence SEQ ID 
NO: 10520. 


2270 


98 


363 


gi10438446


Homo sapiens


cDNA: FLJ22167 fis, clone HRC00584.


1644 


100 


364 


gi12053067


Homo sapiens


mRNA; cDNA DKFZp434I2117 (from
clone DKFZp434I2117).


1237 


100 


364 


gi10438603


Homo sapiens


cDNA: FLJ22282 fis, clone HRC03861.


649 


48 


364 


AAB24463 


Homo sapiens 


Human secreted protein sequence encoded 
by gene 27 SEQ ID NO:88. 


649 


48 


365 


gi12483888


Homo sapiens 


solute carrier 19 A3 mRNA, complete cds. 


2549 


100 


365 


gi14582572


Homo sapiens 


orphan transporter SLC19A3 (SLC19A3) 
mRNA, complete cds. 


2549 


100 


365 


gi12483890


Mus musculus 


solute carrier 19 A3 


1716 


68 


366 


AAB74721 


Homo sapiens 


Human membrane associated protein 
MEMAP-27. 


558 


100 


366 


AAG03412 


Homo sapiens 


Human secreted protein, SEQ ID NO: 
7493. 


464 


100 


366 


gi4929751 


Homo sapiens 


CGM41 protein mRNA, complete cds. 


406 


55 


367 


gi10434145


Homo sapiens


cDNA FLJ12576 fis, clone
NT2RM4001032. 


2598 


100 


367 


gi12803561


Homo sapiens


clone MGC:2991 IMAGE:3160297,
mRNA, complete cds. 


2598 


100 


367 


AAB94138 


Homo sapiens 


Human protein sequence SEQ ID 
NO: 14406. 


2598 


100 


368 


gi4519535


Homo sapiens


CYP4F2 gene for leukotriene B4 omega
hydroxylase, exon 13. 


1227 


65 


368 


gi1857022


Homo sapiens 


Human mRNA for leukotriene B4 omega- 
hydroxylase, complete cds. 


1227 


65 


368 


gi10303605


Homo sapiens


CYP4F11 mRNA, complete cds.


1219 


64 


369 


gi10438815


Homo sapiens


cDNA: FLJ22427 fis, clone HRC09013.


4518


100


369 


gi15076843


Homo sapiens


pecanex-like protein 1 mRNA, complete
cds. 


762 


44 


369 


gi13171105


Takifugu 
rubripes 


pecanex 


578 


42 


370 


gi12656635


Homo sapiens


transmembrane gamma-carboxyglutamic
acid protein 4 TMG4 mRNA, complete 
cds. 


1201 


100 


370 


gi14603178


Homo sapiens


transmembrane gamma-carboxyglutamic
acid protein 4, clone MGC:19793
IMAGE:3841745, mRNA, complete cds.


1201


100






370 


AAB61219 


Homo sapiens 


Human TANGO 292 protein. 


1201 


100 


371 


gi7689031 


Homo sapiens 


uncharacterized hypothalamus protein 
HARP11 mRNA, complete cds.


1847 


100 


371 


gi15080516


Homo sapiens


Similar to uncharacterized hypothalamus
protein HARP11, clone MGC:9273
IMAGE:3862712, mRNA, complete cds. 


1847 


100 


371 


AAY53029 


Homo sapiens 


Human secreted protein clone cw1640_1
protein sequence SEQ ID NO:64. 


1847 


100 


372 


gi10440079


Homo sapiens


cDNA: FLJ23403 fis, clone HEP18857.


2817 


100 


372 


AAY53635 


Homo sapiens 


A bone marrow secreted protein 
designated BMS53. 


758 


50 


372 


gi10439735


Homo sapiens


cDNA: FLJ23144 fis, clone LNG09262.


771 


100 


373 


gi7023450 


Homo sapiens 


cDNA FLJ11036 fis, clone
PLACE1004289. 


980 


87 


373 


AAB93444 


Homo sapiens 


Human protein sequence SEQ ID 
NO: 12686. 


980 


87 


373 


gi1199697


Athalia rosae 


vitellogenin 


107 


42 


374 


gi13447851


Macaca
mulatta


killer immunoglobulin-like receptor 
KIR3DL7 


77 


31 


374 


gi190203


Homo sapiens 


Human cardiac potassium channel 
(KCNA5) mRNA, complete cds. 


83 


33 


374 


gi308765 


Homo sapiens 


Human voltage-gated potassium channel 
(HK2) mRNA, complete cds. 


82 


35 


375 


gi5542014 


Homo sapiens 


DKC1 gene, exons 1 to 11. 


1574 


99 


375 


gi3873221 


Homo sapiens 


dyskerin (DKC1) mRNA, complete cds. 


1574 


99 


375 


gi14603090


Homo sapiens 


dyskeratosis congenita 1, dyskerin, clone 
MGC:15313 IMAGE:4303933, mRNA, 
complete cds. 


1574 


99 


376 


gi5542014 


Homo sapiens 


DKC1 gene, exons 1 to 11. 


2399 


95 


376 


gi3873221 


Homo sapiens 


dyskerin (DKC1) mRNA, complete cds.


2326 


94 


376 


gi14603090


Homo sapiens 


dyskeratosis congenita 1, dyskerin, clone 
MGC:15313 IMAGE:4303933, mRNA, 
complete cds. 


2326 


94 


377 


gi12653555


Homo sapiens


lysophospholipase-like, clone MGC:1216
IMAGE:3163689, mRNA, complete cds.


907 


100 


377 


gi13623261


Homo sapiens


lysophospholipase-like, clone
MGC:10338 IMAGE:3945191, mRNA,
complete cds. 


907 


100 


377 


gi1763011


Homo sapiens


Human lysophospholipase homolog (HU-
K5) mRNA, complete cds. 


907 


100 


378 


gi12653555


Homo sapiens


lysophospholipase-like, clone MGC:1216
IMAGE:3163689, mRNA, complete cds.


903 


100 


378 


gi13623261


Homo sapiens


lysophospholipase-like, clone
MGC:10338 IMAGE:3945191, mRNA,
complete cds. 


903 


100 


378 


gi1763011


Homo sapiens


Human lysophospholipase homolog (HU-
K5) mRNA, complete cds. 


903 


100 


379 


AAY94946 


Homo sapiens 


Human secreted protein clone cd205_2 
protein sequence SEQ ID NO:98. 


571 


93 


379 


AAY53051 


Homo sapiens 


Human secreted protein clone dd119_4
protein sequence SEQ ID NO: 108. 


324 


63 


379 


gi4097381 


Heteracris 
magnifica 


potassium channel toxin HmK 


61 


41 
380


gi6523817 


Homo sapiens 


SIR protein (SIR) mRNA, complete cds. 


928 


93 


380


gwyzy /u / 


Homo sapiens 


CGI-119 protein mRNA, complete cds.


V2o 




380 


AAY77122 


Homo sapiens 


Human neurotransmission-associated 
protein (NTAP) 414692. 


928 


93 


381 


gi6739575 


Mus musculus 


TBX2 protein 


696 


80 


381 


gi6980032 


Mus musculus 


ARL-6 interacting protein-1


696 


80 


381 


AAB54057 


Homo sapiens 


Human pancreatic cancer antigen protein 
sequence SEQ ID NO:509. 


70 


28 


382 


gi13432057


Homo sapiens 


NYD-TSPG mRNA, complete cds. 


206 


25 


382 


AAB95759 


Homo sapiens 


Human protein sequence SEQ ID 
NO: 18680. 


142 


29 


382 


gi14550463


Homo sapiens


DKFZP434B103 protein, clone
MGC:15207 IMAGE:3841498, mRNA,
complete cds. 


106 


32 


383 


AAY48312 


Homo sapiens 


Human prostate cancer-associated protein 
9. 


1509 


100 


383 


gi12654077


Homo sapiens 


clone IMAGE:3458173, mRNA, partial 
cds. 


1191 


100 


383 


AAY73387 


Homo sapiens 


HTRM clone 3340290 protein sequence. 


763 


82 


384 


gi14042559


Homo sapiens


cDNA FLJ14784 fis, clone
NT2RP4000713. 


2492 


100 


384 


AAB93185 


Homo sapiens 


Human protein sequence SEQ ID 
NO:12134. 


2492 


100


384 


AAB56514 


Homo sapiens 


Human prostate cancer antigen protein 
sequence SEQ ID NO: 1092. 


765 


98 


385 


gi12044473


Homo sapiens


mRNA; cDNA DKFZp761D0211 (from
clone DKFZp761D0211).


2875 


100 


385 


gi14336686


Homo sapiens


16p13.3 sequence section 1 of 8.


2786 


98 


385 


AAB58984 


Homo sapiens 


Breast and ovarian cancer associated 
antigen protein sequence SEQ ID 692. 


759 


94 


386 


gi14336686


Homo sapiens


16p13.3 sequence section 1 of 8.


2811 


100 


386 


gi12044473


Homo sapiens


mRNA; cDNA DKFZp761D0211 (from
clone DKFZp761D0211). 


2799 


98 


386 


AAB58984 


Homo sapiens 


Breast and ovarian cancer associated 
antigen protein sequence SEQ ID 692. 


683 


89 


387 


gi3879783 


Caenorhabditis 
elegans 


Similarity to Salmonella regulatory protein
UHPC (SW:UHPC_SALTY)


281 


53 


387 


gi7268507 


Arabidopsis 
thaliana


glycerol-3-phosphate permease like 
protein 


207 


44 


387 


AAB39202 


Homo sapiens 


Human secreted protein sequence encoded 
by gene 24 SEQ ID NO:82.


194 


38 


388


glI4oOUoOZ 


Homo sapiens 


polyamine oxidase isoform-1 mRNA,
complete cds. 


638 


52 


388


gi7021037 


Homo sapiens 


cDNA FLJ20746 fis, clone HEP06040.


637 


52 




A AR1 91 £d 


Homo sapiens


Hydrophobic domain protein from clone
HP10673 isolated from Thymus cells. 


O.J/ 


DA 


389 


gi5911897 


Homo sapiens 


mRNA; cDNA DKFZp586B1417 (from
clone DKFZp586B1417); partial cds. 


6467 


96 


389 


gi14424668


Homo sapiens


clone MGC:14927 IMAGE:4298580,
mRNA, complete cds.


4267


94 


389 


gi10438036


Homo sapiens 


cDNA: FLJ21846 fis, clone HEP01887. 


4259 


94 


390 


gi13529623


Mus musculus


Similar to RIKEN cDNA 4930418P06
gene


1408


81 


390 


gi5656743 


Homo sapiens 


BAC clone CTB-t22E10 from 7q11.23-
q21.1, complete sequence.


105


25






390 


AAB58323 


Homo sapiens 


Lung cancer associated polypeptide 
sequence SEQ ID 661. 


105 


25 


391 


gi14603247


Homo sapiens


Similar to RIKEN cDNA 5730409G15
gene, clone MGC:19636
IMAGE:2822323, mRNA, complete cds. 


754 


96 


391 


AAB36613 


Homo sapiens 


Human FLEXHT-35 protein sequence 
SEQ ID NO:35. 


754 


96 


391 


gi7022832 


Homo sapiens 


cDNA FLJ10661 fis, clone
NT2RP2006106. 


240 


90 


392 


gi10439204


Homo sapiens 


cDNA: FLJ22709 fis, clone HSI13338. 


304 


39 


392 


AAB56085 


Homo sapiens 


Human secreted protein sequence encoded 
by gene 9 SEQ ID NO:179.


304 


39 


392 


gi7407643 


Canis 
familiaris 


occludin IB 


177 


32 


393 


AAB18993


Homo sapiens 


Amino acid sequence of a human 
transmembrane protein. 


1212 


70 


393 


gi15079979


Homo sapiens


Similar to RIKEN cDNA 3830408P04
gene, clone MGC:19609
IMAGE:3640970, mRNA, complete cds. 


1211 


70 


393 


gil31U831 


Homo sapiens 


clone IMAGE:3451448, mRNA, partial 
cds. 


980 


68 


394 


AAY59713 


Homo sapiens 


Secreted protein 76-20-3-H1-FL1.


865 


92 


394 


gi4220892 


Homo sapiens 


transcriptional co-activator CRSP34 
(CRSP34) mRNA, complete cds. 


920 


95 


394 


gi7141322 


Homo sapiens 


p37 TRAP/SMCC/PC2 subunit mRNA, 
complete cds. 


919 


95 


395 


gi3880799 


Caenorhabditis 
elegans 


Y39A1B.2 


837 


33 


395 


gi1707052


Caenorhabditis 
elegans 


similar to drosophilia and mouse patched 
proteins 


616 


35 


395 


gi861251 


Caenorhabditis 
elegans 


weakly similar to C. elegans protein 
F54G8.5 and to C. elegans protein 
F44F4.4 


475 


31 


396 


gi765240 


human, liver, 
mRNA, 1731 
nt]. [Homo 
sapiens 


hPPAR alpha =peroxisome proliferator 
activated receptor alpha 


2011 


99 


396 


AAR74053 


Homo sapiens 


Human peroxisome proliferator activated 
receptor. 


2011 


99 


396 


AAB20342 


Homo sapiens 


Peroxisome proliferator-activated receptor
alpha. 


2011 


99 


397 


AAB43983 


Homo sapiens 


Human cancer associated protein sequence 
SEQ ID NO: 1428. 


1692 


100 


397 


AAA88691 
aal 


Homo sapiens 


Human transmembrane protein 
NPCAHH01 cDNA. 


1410 


100 


397 


gi5565977 


Homo sapiens 


transmembrane protein BRI (BRI) mRNA, 
complete cds. 


1409 


100 


398 


gi4894991 


Drosophila 
melanogaster 


sodium-hydrogen exchanger NHE1 


1362 


61 


398 


gi3979941 


Caenorhabditis 
elegans 


contains similarity to Pfam domain: 
PF00999 (Sodium/hydrogen exchanger 
family), Score=354.0, E-value=5.3e-103, 
N=1


1059 


46 
398


gi14150471


Homo sapiens 


nonselective sodium potassium/proton 
exchanger (NHE7) mRNA, complete cds. 


679 


40 


399 


gi7023154 


Homo sapiens 


cDNA FLJ10856 fis, clone
NT2RP4001547. 


1617 


99 


399 


AAY28810 


Homo sapiens 


nn296 2 secreted protein. 


1617 


99 


399 


AAB93258 


Homo sapiens 


Human protein sequence SEQ ID 
NO: 12282. 


1617 


99 


400 


AAG00388 


Homo sapiens 


Human secreted protein, SEQ ID NO: 
4469. 


316 


100 


400 


gi11967794


Echinops 
telfairi 


NADH dehydrogenase subunit 4L 


61 


29 


400 


gi3211979


Homo sapiens 


sarco-/endoplasmic reticulum Ca-ATPase 
3 (ATP2A3) mRNA, alternatively spliced, 
partial cds. 


54 


39 


401 


gi14043649


Homo sapiens


clone MGC:14161 IMAGE:4111078,
mRNA, complete cds. 


253 


33 


401 


gi2623016 


Methanothermobacter
thermautotrophicus


heterodisulfide reductase, subunit C 


88 


30 


401 


gi4262178 


Arabidopsis 
thaliana 


25726 


87 


28 


402 


gi6164616 


Homo sapiens 


F-box protein Fbl3b (FBL3B) mRNA, 
partial cds. 


128 


26 


402 


AAY83075 


Homo sapiens 


F-box protein FBP-3b. 


128 


26 


402 


AAY83043 


Homo sapiens 


F-box protein FBP-3. 


109 


23 


403 


AAB98207 


Homo sapiens 


Human P24 protein-22 SEQ ID NO:2. 


1009 


99 


403 


gi1890141


Mus musculus 


P24 protein 


940 


91 


403 


gi10439977


Homo sapiens


cDNA: FLJ23329 fis, clone HEP12646.


274 


38 


404 


gi13276693


Homo sapiens


mRNA; cDNA DKFZp761F069 (from
clone DKFZp761F069); complete cds.


807


70 


404 


gi7020303 


Homo sapiens 


cDNA FLJ20300 fis, clone HEP06465.


539 


39 


404 


AAB67575 


Homo sapiens 


Amino acid sequence of a human 
hydrolytic enzyme HYENZ7. 


435 


33


405 


gi3878748 


Caenorhabditis 
elegans 


M176.4


98 


24 


405 


gi7542459 


Taeniopygia 
guttata 


SWS1 opsin 


92 


29 


405 


AAB76874 


Homo sapiens 


Human lung tumour protein related 
protein sequence SEQ ID NO:799. 


65 


51 


406 


gi3880799 


Caenorhabditis 
elegans 


Y39A1B.2 


634 


25 


406 


gi861251 


Caenorhabditis 
elegans 


weakly similar to C. elegans protein 
F54G8.5 and to C. elegans protein 
F44F4.4 


261 


24


406 


gi1255388


Caenorhabditis 
elegans 


similar to drosophila membrane protein 
PATCHED (SP:P18502) 


255 


26 


407 


gi14603058


Homo sapiens


clone IMAGE:4134852, mRNA, partial
cds. 


1067 


100 


407 


gi1016178


Cyanophora 
paradoxa 


PsaE 


53 


32 


407 


gi12724543


Lactococcus 
lactis subsp. 
lactis 


UNKNOWN PROTEIN 


78 


43 
408


AAB12150 


Homo sapiens 


Hydrophobic domain protein isolated from 
HT-1080 cells. 


952 


100 


408 


gi13096862


Mus musculus


RIKEN cDNA 9430096L06 gene


845 


88 


408 


AAB29651 


Homo sapiens 


Human membrane-associated protein 
HUMAP-8. 


502 


100 


409 


gi15074997


Sinorhizobium 
meliloti 


CONSERVED HYPOTHETICAL 
PROTEIN 


98 


32 


409 


AAG73357 


Homo sapiens 


Human gene 12-encoded secreted protein 
HBXAM53, SEQ ID NO: 128.


57 


35 


409 


AAG73405 


Homo sapiens 


Human gene 12-encoded secreted protein 
HBXAM53, SEQ ID NO: 176. 


57 


35 


410 


gi1669689


Homo sapiens


H.sapiens TAFII105 mRNA, partial.


3902 


98 


410 


AAW31494 


Homo sapiens 


Human hTAFII105 protein. 


3902 


98 


410 


AAY57279 


Homo sapiens 


Transcription factor subunit TAFII105 
polypeptide. 


3902 


98 


411 


AAG71672 


Homo sapiens 


Human olfactory receptor polypeptide, 
SEQ ID NO: 1353. 


1202 


94 


411 


AAG72062 


Homo sapiens 


Human olfactory receptor polypeptide, 
SEQ ID NO: 1743. 


1068 


66 


411 


AAG71847 


Homo sapiens 


Human olfactory receptor polypeptide, 
SEQ ID NO: 1528. 


1051 


67 


412 


AAY16630


Homo sapiens 


Human Putative Adrenomedullin Receptor 
(PAR). 


1592 


99 


412 


gi292419 


Homo sapiens 


Human homologue of the canine orphan 
receptor (RDC1) mRNA, 5' end. 


1580 


98 


412 


gi899 


Canis 
familiaris 


RDC1 receptor (AA 1-362) 


1503 


92 


413 


AAY95002 


Homo sapiens 


Human secreted protein vc34_l, SEQ ID 
NO:44. 


985 


71 


413 


gi14550480


Homo sapiens


clone MGC:16377 IMAGE:3936171,
mRNA, complete cds. 


917 


97 


413 


gi7020918 


Homo sapiens 


cDNA FLJ20668 fis, clone KAIA585.


179 


37 


414 


AAB56877 


Homo sapiens 


Human prostate cancer antigen protein 
sequence SEQ ID NO: 1455. 


1004 


98 


414 


gi13991373


Hymenolepis 
diminuta 


NADH dehydrogenase subunit 4L 


62 


38 


414 


gi14487711


Hepatitis C 
virus I 


polyprotein 


62 


50 


415 


gi179165


Homo sapiens 


Human Na,K-ATPase subunit alpha 2 
(ATP1A2) gene, complete cds. 


5238 


99 


415 


gi203029 


Rattus
norvegicus 


(Na+ and K+) ATPase, alpha+ catalytic 
subunit precursor 


5205 


98 


415 


gi212406


Gallus gallus


Na,K-ATPase alpha-2-subunit


4977 


93 


416 


AAB90649 


Homo sapiens 


Human secreted protein, SEQ ID NO: 192. 


563 


92 


416




Homo sapiens 


Human secreted protein, SEQ ID NO: 103. 


472 


100 


416 


AAB90651 


Homo sapiens 


Human secreted protein, SEQ ID NO: 194. 


203 


97 


417 


gi6599290 


Homo sapiens 


mRNA; cDNA DKFZp586C1021 (from 
clone DKFZp586C1021); partial cds. 


81 


25 


417 


gi7190652 


Chlamydia 
muridarum 


phosphoenolpyruvate-protein 
phosphotransferase 


89 


21


417 


gi14700035


Aspergillus 
nidulans 


nuclear transport factor 2 


76 


37 


418 


gi13249295


Homo sapiens 


anion exchanger AE4 mRNA, complete 
cds. 


4951 


100 
418


gi13517508


Homo sapiens 


sodium bicarbonate cotransporter 
(SLC4A9) mRNA, partial cds. 


4493 


95 


418 


gi11611537


Oryctolagus 
cuniculus 


anion exchanger 4a 


4231 


85 


419 


gi2564913 


Homo sapiens 


clk2 kinase (CLK2), propin1, cote1,
glucocerebrosidase (GBA), and metaxin 
genes, complete cds; metaxin pseudogene 
and glucocerebrosidase pseudogene; and 
thrombospondin3 (THBS3) gene, partial 
cds. 


1109 


82 


419 


gi1326108


Homo sapiens 


Human metaxin (MTX) gene, complete 
cds. 


1109 


82 


419 


gi12804907


Homo sapiens


Similar to metaxin 1, clone MGC:2518
IMAGE:3546178, mRNA, complete cds. 


1100 


99 


420 


gi2564913 


Homo sapiens 


clk2 kinase (CLK2), propin1, cote1,
glucocerebrosidase (GBA), and metaxin 
genes, complete cds; metaxin pseudogene 
and glucocerebrosidase pseudogene; and 
thrombospondin3 (THBS3) gene, partial 
cds. 


1665 


100 


420 


gi1326108


Homo sapiens 


Human metaxin (MTX) gene, complete 
cds. 


1665 


100 


420


gi807670 


Mus musculus 


metaxin 


1519 


91 


421 


gi6094684 


Homo sapiens 


PAC clone RP1-278D1 from X, complete 
sequence. 


580 


30 


421 


gi7023516 


Homo sapiens 


cDNA FLJ11078 fis, clone
PLACE1005102, weakly similar to RING
CANAL PROTEIN.


547 


30 


421 


AAB93480 


Homo sapiens 


Human protein sequence SEQ ID 
NO: 12768. 


547 


30 


422 


gi14715068


Homo sapiens


Similar to RIKEN cDNA 2600001A11
gene, clone MGC:9907 IMAGE:3870073,
mRNA, complete cds. 


2062 


100 


422 


gi3342906


Homo sapiens


2-amino-3-ketobutyrate-CoA ligase
mRNA, nuclear gene encoding 
mitochondrial protein, complete cds. 


853 


89 


422 


gi4093159 


Mus musculus 


2-amino-3-ketobutyrate-coenzyme A 
ligase 


834 


87 


423 


AAB24058 


Homo sapiens 


Human PRO290 protein sequence SEQ ID 
NO:7. 


1972 


100 


423 


AAY66639 


Homo sapiens 


Membrane-bound protein PRO290. 


1972 


100 


423 


AAB65162 


Homo sapiens 


Human PRO290 (UNQ253) protein 
sequence SEQ ID NO:33. 


1972 


100 


424 


gi167835


Dictyostelium 
discoideum 


myosin heavy chain 


152 


24 


424 


gi14042847


Homo sapiens


cDNA FLJ14957 fis, clone
PLACE4000009, weakly similar to
MYOSIN HEAVY CHAIN,
NONMUSCLE TYPE B.


135 


26 


424 


AAB95546 


Homo sapiens 


Human protein sequence SEQ ID 
NO:18167. 


135 


26 


425 


AAB43587 


Homo sapiens 


Human cancer associated protein sequence 
SEQ ID NO: 1032. 


427 


100 


425 


AAG00658 


Homo sapiens 


Human secreted protein, SEQ ID NO:
4739.


360


97






425 


AAG00657 


Homo sapiens 


Human secreted protein, SEQ ID NO: 
4738. 


243 


72 


426 


gi13325388


Homo sapiens


Similar to RIKEN cDNA 1110007C09
gene, clone MGC:11115
IMAGE:3833318, mRNA, complete cds.


535 


99 


426 


AAB93133 


Homo sapiens 


Human protein sequence SEQ ID 
NO: 12027. 


77 


30 


427 


gi7023138 


Homo sapiens 


cDNA FLJ10847 fis, clone
NT2RP4001379. 


731 


49 


427 


AAB93249 


Homo sapiens 


Human protein sequence SEQ ID 
NO: 12263. 


731 


49 


427 


AAB18977


Homo sapiens 


Amino acid sequence of a human 
transmembrane protein. 


616 


89 


428 


AAB18977


Homo sapiens 


Amino acid sequence of a human 
transmembrane protein. 


1008 


100 


428 


gi7023138 


Homo sapiens 


cDNA FLJ10847 fis, clone
NT2RP4001379. 


765 


43 


428 


AAB93249 


Homo sapiens 


Human protein sequence SEQ ID 
NO: 12263. 


765 


43


429 


AAG03349 


Homo sapiens 


Human secreted protein, SEQ ID NO: 
7430. 


59 


28 


429 


gi12620543


Bradyrhizobium
japonicum


ID263 


63 


30 


429 


AAY20368 


Homo sapiens 


Human microtubule associated protein 2 
mutant fragment 64. 


53 


40 


430 


gi7209839 


Homo sapiens 


mRNA for casein kinase I epsilon, 
complete cds. 


1564 


99 


430 


gi13676318


Homo sapiens


casein kinase 1, epsilon, clone
MGC:10398 IMAGE:3937782, mRNA,
complete cds. 


1564 


99 


430 


gi852057


Homo sapiens 


casein kinase 1 epsilon mRNA, complete 
cds. 


1564 


99 


431 


gi2642187 


Rattus
norvegicus 


endo-alpha-D-mannosidase 


1973 


87 


431 


gi10434559


Homo sapiens


cDNA FLJ12838 fis, clone
NT2RP2003230, moderately similar to 
Rattus norvegicus endo-alpha-D- 
mannosidase (Enman) mRNA. 


1559 


99 


431 


AAB95204 


Homo sapiens 


Human protein sequence SEQ ID 
NO: 17303. 


1559 


99 


432 


gi12044469


Homo sapiens 


mRNA; cDNA DKFZp761H1710 (from 
clone DKFZp761H1710); complete cds. 


141 


37 


432 


gi15079305


Mus musculus 


RIKEN cDNA 9130020G10 gene 


126 


37 


432 




Homo sapiens


mRNA; cDNA DKFZp434E1818 (from
clone DKFZp434E1818); partial cds. 


1 1 A 


A 1 
Hi 


433 


gi12803977


Homo sapiens


clone MGC:4175 IMAGE:3634983,
mRNA, complete cds. 


611 


100 


433 


AAB34781 


Homo sapiens 


Human secreted protein sequence encoded 
by gene 9 SEQ ID NO:69.


58 


39 


433 


AAW39938 


Homo sapiens 


Peptide effecting G-protein-coupled 
receptor activity.


57 


37 


434 


gi2150013 


Homo sapiens 


transmembrane protein mRNA, complete 
cds. 


1159 


100 
434


gi12803197


Homo sapiens


claudin 5 (transmembrane protein deleted
in velocardiofacial syndrome), clone
MGC:8543 IMAGE:2822745, mRNA,
complete cds. 


1159 


100 


434 


AAY91533 


Homo sapiens 


Human secreted protein sequence encoded 
by gene 83 SEQ ID NO:206. 


1159 


100 


435 


gi15082442


Homo sapiens


clone MGC:20235 IMAGE:4562851,
mRNA, complete cds. 


1368 


100 


435 


gi7023829 


Homo sapiens 


cDNA FLJ11273 fis, clone
PLACE1009338.


503 


42 


435 


AAB93645 


Homo sapiens 


Human protein sequence SEQ ID 
NO:13146. 


503 


42 


436 


gi11640570


Homo sapiens 


MSTP031 mRNA, complete cds. 


777 


100 


436 


AAY91516 


Homo sapiens 


Human secreted protein sequence encoded 
by gene 66 SEQ ID NO: 189. 


70 


44 


436 


AAY91657 


Homo sapiens 


Human secreted protein sequence encoded 
by gene 66 SEQ ID NO:330. 


70 


44 


437 


AAG73464 


Homo sapiens 


Human gene 7-encoded secreted protein 
fragment, SEQ ID NO:239. 


2267 


98 


437 


AAG73462 


Homo sapiens 


Human gene 7-encoded secreted protein 
fragment, SEQ ID NO:237. 


1898 


99 


437 


AAG73463 


Homo sapiens 


Human gene 7-encoded secreted protein 
fragment, SEQ ID NO:238. 


1881 


98 


438 


gi9886738 


Homo sapiens 


JP3 mRNA for junctophilin type3, 
complete cds. 


3916 


99 


438 


gi9927307 


Mus musculus 


junctophilin type 3 


3549 


90 


438 


gi9886757 


Homo sapiens 


JP3 gene for junctophilin type3, exon 5 
and partial cds. 


3172 


100 


439 


AAB08894 


Homo sapiens 


Human secreted protein sequence encoded 
by gene 4 SEQ ID NO:51.


240 


64 


439 


gi7414441 


porcine
endogenous
retrovirus


envelope protein 


147 


28 


439 


gi348952 


Rat Leukemia 
virus 


envelope protein 


145 


26 


440 


gi13623369


Homo sapiens 


clone IMAGE:3957135, mRNA, partial 
cds. 


2617 


100 


440 


AAB43484 


Homo sapiens 


Human cancer associated protein sequence 
SEQ ID NO:929. 


761 


100 


440 


gi14247685


Staphylococcus
aureus subsp.
aureus Mu50


nicotinate phosphoribosyltransferase
homolog 


370 


40 


441 


gi13623369


Homo sapiens 


clone IMAGE:3957135, mRNA, partial 
cds. 


2077 


94 


441 


AAB43484


Homo sapiens 


Human cancer associated protein sequence 
SEQ ID NO:929. 


761 


100 


441 


gi14247685


Staphylococcus
aureus subsp.
aureus Mu50


nicotinate phosphoribosyltransferase
homolog 


370 


40 


442 


gi13623369


Homo sapiens 


clone IMAGE:3957135, mRNA, partial 
cds. 


2517 


97 


442 


AAB43484 


Homo sapiens 


Human cancer associated protein sequence 
SEQ ID NO:929.


761 


100 


442 


gi14247685


Staphylococcus
aureus subsp.
aureus Mu50


nicotinate phosphoribosyltransferase
homolog


370


40






443 


gi13182757


Homo sapiens 


HTPAP mRNA, complete cds. 


639 


65 


443 


AAB70690 


Homo sapiens 


Human hDPP protein sequence SEQ ID 
NO:7. 


639 


65 


443 


gi14020949


Arabidopsis 
thaliana 


phosphatidic acid phosphatase 


460 


39 


444 


gi10436254


Homo sapiens


cDNA FLJ13948 fis, clone
Y79AA1001023. 


529 


41 


444 


AAB94837 


Homo sapiens 


Human protein sequence SEQ ID 
NO: 16006. 


529 


41 


444 


gi7022187 


Homo sapiens 


cDNA FLJ10261 fis, clone
HEMBB1000975. 


521 


42 


445 


gi1403547


Saccharomyces
cerevisiae


P2558 protein 


162 


26 


445 


gi2621070 


Methanothermobacter
thermautotrophicus


ribosomal protein S18 (E.coli S13)


79 


33 


445 


gi4097361 


Human 
parainfluenza 
virus 1 


nucleocapsid protein 


59 


30 


446 


gi15157363


Agrobacterium 
tumefaciens 


AGR_C_4025p 


259 


32 


446 


gi15075368


Sinorhizobium 
meliloti 


CONSERVED HYPOTHETICAL 
PROTEIN 


251 


31 


446 


gi15024663


Clostridium
acetobutylicum


Uncharacterized protein, YfiH family 


198 


28 


447 


gi12584947


Homo sapiens 


ovary-specific acidic protein mRNA, 
complete cds. 


1195 


100 


447 


gi632549 


Petromyzon 
marinus 


NF-180 


152 


30 


447 


gi4678807 


Homo sapiens 


Human gene from PAC 179D3, 
chromosome X, isoform of mitochondrial 
apoptosis inducing factor, AIF, 
AF100928. 


140 


32 


448 


AAX23994 
aal 


Homo sapiens 


Human CAR receptor DNA. 


1495 


99 


448 


gi458542 


Homo sapiens 


H.sapiens mRNA for orphan nuclear 
hormone receptor. 


1494 


99 


448 


AAR41346 


Homo sapiens 


Human CAR receptor polypeptide. 


1494 


99 


449 


gi14625447


Rattus 
norvegicus 


MT-protocadherin 


2566 


83 


449


AAr>lzl!>4 


Homo sapiens 


Hydrophobic domain protein isolated from 
WERI-RB cells. 


895 


100 


449 


gi13537202


Homo sapiens 


PC-LKC mRNA for protocadherin LKC, 
complete cds. 


445 


31


450 


gi10880797


Mus musculus


Syne-1A


124 


27 


450 


gi5262574 


Homo sapiens 


mRNA; cDNA DKFZp434G173 (from
clone DKFZp434G173); complete cds. 


108 


26 


450 


gi10880799


Mus musculus


Syne-1B


124 


27 


451 


gi11967375


Rattus 
norvegicus 


Dvl-binding protein Idax 


1062 


100


451


gi11967377


Homo sapiens 


Dvl-binding protein IDAX mRNA, 
complete cds. 


1062 


100 


451 


gi7023269 


Homo sapiens 


cDNA FLJ10920 fis, clone
OVARC1000384.


348 


48 


452 


gi4929538 


Rattus 
norvegicus 


Olg-1 bHLH protein 


1088 


87 


452 


gi11602814


Mus musculus


Olig1 bHLH protein


1070 


86 


452 


gi7385152 


Mus musculus 


oligodendrocyte-specific bHLH
transcription factor Olig1


1070 


86 


453 


gi3851514 


Phytophthora 
infestans 


cyst germination specific acidic repeat 
protein precursor 


874 


31 


453 


gi454154 


Homo sapiens 


intestinal mucin (MUC2) mRNA, 
complete cds. 


746 


26 


453 


gi296881 


Clostridium 
thermocellum 


S-layer protein 


678 


34 


454 


gi4929577 


Homo sapiens 


CGI-54 protein mRNA, complete cds.


1552 


100 


454 


AAY13942


Homo sapiens 


Human transmembrane protein, HP01737. 


1552 


100 


454 


AAB36611 


Homo sapiens 


Human FLEXHT-33 protein sequence 
SEQ ID NO:33. 


1546 


99 


455 


gi295671 


Saccharomyces
cerevisiae


selected as a weak suppressor of a mutant 
of the subunit AC40 of DNA dependant 
RNA polymerase I and III 


108 


21 


455 


gi2425111 


Dictyostelium 
discoideum 


ZipA 


107 


20 


455 


gi1279563


Medicago
sativa


nuM1


104 


21 


456 


AAB58236 


Homo sapiens 


Lung cancer associated polypeptide 
sequence SEQ ID 574. 


286 


88 


456 


gi2065288 


Doryctobracon 
crawfordi 


cytochrome b 


61 


30 


456 


gi1653554


Synechocystis
sp. PCC6803


CDP-diacylglycerol--glycerol-3-phosphate
3-phosphatidyl transferase 


48 


45 


457 


gi3273731 


Homo sapiens 


MHC class I region.


603 


95 


457 


gi312407


Homo sapiens 


Human HLA-F gene for human leukocyte 
antigen F. 


603 


95 


457 


gi14349362


Homo sapiens 


Similar to major histocompatibility 
complex, class I, F, clone MGC:15399
IMAGE:4039990, mRNA, complete cds. 


599 


95 


458


AAG71945 


Homo sapiens 


Human olfactory receptor polypeptide, 
SEQ ID NO: 1626. 


1106 


96 


458 


AAG71532 


Homo sapiens 


Human olfactory receptor polypeptide, 
SEQ ID NO: 1213. 


1104 


96 


458 


AAG71525 


Homo sapiens 


Human olfactory receptor polypeptide, 
SEQ ID NO: 1206. 


641 


53 


459


gn ioizu/y 


Homo sapiens 


DC-specific transmembrane protein 
mRNA, complete cds. 


2448


100


459 


AAE02638 


Homo sapiens 


Human dendritic cell specific 
transmembrane protein (DC-STAMP). 


2448 


100 


459 


AAB87357 


Homo sapiens 


Human gene 16 encoded secreted protein 
HMADJ14, SEQ ID NO:98. 


1798 


99 


460 


gi3006230 


Homo sapiens 


PAC clone RP4-604G5 from 7q22-q31.1, 
complete sequence. 


85 


35 


460 


gi47373 


Streptococcus 
pneumoniae 


7 kDa protein 


59 


42


460


gi5880698 


Nephroselmis 
olivacea 


translational initiation factor 1 


57 


30 


461 


AAG73470 


Homo sapiens 


Human gene 14-encoded secreted protein 
fragment, SEQ ID NO:245. 


699 


100 


461 


gi10436625


Homo sapiens


cDNA FLJ14220 fis, clone
NT2RP3003828. 


489 


53 


461 


AAB95779 


Homo sapiens 


Human protein sequence SEQ ID 
NO: 18726. 


489 


53 


462 


gi7021367 


Drosophila 
melanogaster 


c11.1


522 


27 


462 


gi12724134


Lactococcus 
lactis subsp. 
lactis 


HYPOTHETICAL PROTEIN 


84 


33 


463 


gi7322066 


Drosophila sp. 


His 


367 


28 


463 


gi3309579 


Rattus 
norvegicus 


A-kinase anchor protein 121; AKAP121


155 


27 


463 


gi2072307 


Mus musculus 


AKAP121 


154 


27 


464 


AAB47106 


Homo sapiens 


Second splice variant of MAPP. 


4193 


99 


464 


AAB47105 


Homo sapiens 


First splice variant of MAPP. 


3311 


100 


464 


gi14550175


Mus musculus 


ADAM33 


2684 


72 


465 


gi14091952


Rattus 
norvegicus 


KIDINS220 


324 


27 


465 


gi11321435


Rattus 
norvegicus 


ankyrin repeat-rich membrane-spanning 
protein 


320 


27 


465 


gi6599237 


Homo sapiens 


mRNA; cDNA DKFZp434F0621 (from
clone DKFZp434F0621). 


220 


27 


466 


gi9864747


Leishmania 
major 


L165.9 


225 


35 


466 


gi3021392 


Homo sapiens 


H.sapiens mRNA for nuclear protein 
SDK3, partial. 


118 


34 


466 


gi5734402 


Homo sapiens 


mRNA for GANP protein. 


96


27 


467 


gi12002028


Homo sapiens


brain my040 protein mRNA, complete
cds. 


482 


100 


467 


AAB56147 


Homo sapiens 


Human secreted protein sequence encoded 
by gene 71 SEQ ID NO:241.


74 


36 


467 


AAB56272 


Homo sapiens 


Human secreted protein sequence encoded 
by gene 71 SEQ ID NO:366. 


74 


36 


468 


AAY94938 


Homo sapiens 


Human secreted protein clone ye78_1
protein sequence SEQ ID NO: 82.


2290 


97 


468 


gi13603412


Homo sapiens 


B29 mRNA, complete cds. 


187 


30 


468 


AAY17227


Homo sapiens 


Human secreted protein (clone yal-1). 


203 


26 


469 


AAY27721 


Homo sapiens 


Human secreted protein encoded by gene 
No. 29. 


1118 


88 


469 


AAB87068 


Homo sapiens 


Human secreted protein TANGO 365, 
SEQ ID NO:46. 


621 


99 


469 


AAB87146 


Homo sapiens 


Human secreted protein TANGO 365 
A5V variant, SEQ ID NO: 161. 


617 


98 


470 


gi10438739


Homo sapiens


cDNA: FLJ22376 fis, clone HRC07327.


1931 


99 


470 


AAE03639 


Homo sapiens 


Human extracellular matrix and cell 
adhesion molecule-3 (XMAD-3). 


1934 


99 


470 


gi4033606 


Adiantum
capillus-veneris


Extensin 


200 


33 


471 


gi1769467


Homo sapiens


Human p126 (ST5) mRNA, complete cds.


1504


70


471


gi1769472


Homo sapiens 


Human p82 (ST5) mRNA, alternatively 
spliced, complete cds. 


1504 


70 


471 


gi257387 


human, 

revertant clone 
F2, mRNA 
Partial, 2687 
nt). [Homo 
sapiens 


HTS1=HeLa tumor suppressor gene


1504 


70 


472 


gi9944535 


Amsacta
moorei
entomopoxvirus


AMV012 


69 


29 


472 


gi55950O 


Caenorhabditis 
elegans 


ND2 protein (AA 1 - 282) 


81 


35 


472 


gi15042251


Chilo
iridescent
virus


150R 


62 


36 


473 


gi55950O 


Caenorhabditis 
elegans 


ND2 protein (AA 1 - 282) 


91 


26 


473 


gi9944535 


Amsacta
moorei
entomopoxvirus


AMV012 


69 


29 


473 


gi9944642 


Amsacta
moorei
entomopoxvirus


AMV119 


73 


29 


474 


gi5739566 


Homo sapiens 


BAC clone CTA-332P12 from 7q22-
q31.1, complete sequence.


907 


100 


474 


gi32474 


Homo sapiens 


H.sapiens h-Sp1 mRNA.


907 


100 


474 


gi632790 


human, 
keratinocyte 
line HaCaT, 
mRNA, 2106 
1 nt]. [Homo 
sapiens 


pantophysin 


907 


100 


475


gi14603247


Homo sapiens 


Similar to RIKEN cDNA 5730409G15
gene, clone MGC: 19636 
IMAGE:2822323, mRNA, complete cds. 


937 


100 


475


AAB36613


Homo sapiens 


Human FLEXHT-35 protein sequence 
SEQ ID NO:35. 


937 


100 


475 


gl702Zo32 


Homo sapiens 


cUNA MJlUool ns, clone 
NT2RP2006106. 


240 


90


476


gl5032674 


Drosophila 
melanogaster


BcDNA.LD29892 


162 


38 


476 


AAB21007 


Homo sapiens 


Human nucleic acid-binding protein, 
NuABP-11. 


167 


39 


476 


gi9295345 


Homo sapiens 


HSKM-B (HSKM-B) mRNA, complete 
cds. 


173 


31 


477 


AAG71509 


Homo sapiens 


Human olfactory receptor polypeptide, 
SEQ ID NO: 1190. 


1510 


96 


477 


AAG71669 


Homo sapiens 


Human olfactory receptor polypeptide, 
SEQ ID NO: 1350. 


1198 


77 


477 


AAG71820 


Homo sapiens 


Human olfactory receptor polypeptide,
SEQ ID NO: 1501.


1181


75


478 


AAY73483 


Homo sapiens 


Human secreted protein clone yll8_l 
protein sequence SEQ ID NO: 188. 


582 


47 


478 


AAW85723 


Homo sapiens 


Novel protein (Clone AX56 28). 


246 


34 


478 


AAG03191 


Homo sapiens 


Human secreted protein, SEQ ID NO: 
7272. 


112 


30 


479 


gi15079907


Homo sapiens 


Similar to secretory carrier membrane 
protein 4, clone MGC:19661
IMAGE:3161979, mRNA, complete cds. 


1182 


94 


479 


gi9837305 


Rattus 
norvegicus 


secretory carrier membrane protein 4 


1012 


79 


479 


gi7021484 


Mus musculus 


secretory carrier membrane protein 4 


1006 


77 


480 


git345560 


Oryza sativa 


nitrate reductase apoenzyme (AA 394- 
471) (130 is 2nd base in codon) 


72 


44 


481 


gi13517508


Homo sapiens 


sodium bicarbonate cotransporter 
(SLC4A9) mRNA, partial cds. 


5138 


100 


481 


gi14582760


Homo sapiens 


anion exchanger AE4 mRNA, complete 
cds. 


4603 


96 


481 


gi11611537


Oryctolagus 
cuniculus 


anion exchanger 4a 


4080 


86 


482 


gi2570933 


Rattus 
norvegicus 


vanilloid receptor subtype 1 


986 


44 


482 


gi7544146 


Rattus 
norvegicus 


vanilloid receptor type 1 like protein 1 


979 


45 


482 


gi11055318


Rattus 
norvegicus 


vanilloid receptor-related osmotically 
activated channel 


951 


43 


483 


gi14669436


Homo sapiens 


alkaline phytoceramidase (APHC) mRNA, 
complete cds. 


110 


54 


483 


AAB18986


Homo sapiens 


Amino acid sequence of a human 
transmembrane protein. 


110 


54 


483 


gi14488266


Arabidopsis
thaliana


Acyl-CoA independent ceramide synthase 


91 


33 


484 


gi12053091


Homo sapiens 


mRNA; cDNA DKFZp434F1719 (from 
clone DKFZp434F1719); complete cds. 


615 


97 


484 


AAE01546 


Homo sapiens 


Human gene 1 encoded secreted protein 
HMVCQ82, SEQ ID NO:96. 


76 


39 


484 


gi1574439


Haemophilus
influenzae Rd


leucine responsive regulatory protein (lrp)


77 


36 


485 


AAY99347 


Homo sapiens 


Human PRO1113 (UNQ556) amino acid
sequence SEQ ID NO:24. 


2250 


99 


485 


AAB71863 


Homo sapiens 


Human hi 5571 GPCR. 


1834 


48 


485 


gi7407148 


Homo sapiens 


protocadherin Flamingo 2 (FMI2) mRNA, 
complete cds. 


306 


27 


486 


AAW94654


Homo sapiens 


G-protein coupled receptor HM74A 
protein.


887 


52 


486 


gi219867 


Homo sapiens 


Human mRNA for HM74. 


882 


52 


486 


AAY90637 


Homo sapiens 


Human G protein-coupled receptor HM74. 


882 


52


487 


gi3337385 


Homo sapiens 


Chromosome 16 BAC clone CIT987SK- 
A-761H5, complete sequence. 


1158 


83 


487 


gi2342743 


Homo sapiens 


Human Chromosome 16 BAC clone 
CIT987SK-A-589H1, complete sequence.


709 


59 


487 


gi4959568 


Homo sapiens 


nuclear pore complex interacting protein 
NPIP (NPIP) mRNA, complete cds. 


705 


58 


488 


gi7021167 


Homo sapiens 


cDNA FLJ20839 fis, clone ADKA02346.


551 


98


488


gi9309293 


Homo sapiens 


hasc-1 mRNA for asc-type amino acid 
transporter 1, complete cds. 


551 


98 


488 


gi7415938


Mus musculus 


asc) 


460 


83 


489 


gi14248997


Homo sapiens 


lung seven transmembrane receptor I 
(LUSTR1) mRNA, complete cds. 


2239 


97 


489 


gi10439034


Homo sapiens


cDNA: FLJ22591 fis, clone HSI03124.


1515 


98 


489 


gi14248999


Mus musculus 


lung seven transmembrane receptor 2 


813 


49 


490 


AAY87079 


Homo sapiens 


Human secreted protein sequence SEQ ID 
NO:118. 


927 


82 


490 


gi3851540 


Homo sapiens 


brain mitochondrial carrier protein- 1 
(BMCP1) mRNA, nuclear gene encoding 
mitochondrial protein, complete cds. 


927 


82 


490 


gi11094335


Homo sapiens


mitochondrial uncoupling protein 5 long
form mRNA, complete cds; nuclear gene 
for mitochondrial product. 


927 


82


491 


AAG71803 


Homo sapiens 


Human olfactory receptor polypeptide, 
SEQ ID NO: 1484. 


1616 


100 


491 


AAG71807 


Homo sapiens 


Human olfactory receptor polypeptide, 
SEQ ID NO: 1488. 


1165 


69 


491 


AAG71805 


Homo sapiens 


Human olfactory receptor polypeptide, 
SEQ ID NO: 1486. 


1099 


83 


492 


gi10440458


Homo sapiens 


mRNA for FLJ00065 protein, partial cds. 


992 


100


492 


gi938175 


Gallus gallus 


alpha 1 (XIV) collagen 


102 


32 


492 


gi211358 


Gallus gallus 


alpha- 1 collagen type IX 


63 


45 


493 


gi9963845 


Homo sapiens 


HT017 mRNA, complete cds. 


558 


38 


493 


AAW09405 


Homo sapiens 


Pineal gland specific gene-1 protein. 


558 


38 


493 


AAB69185 


Homo sapiens 


Human hISLR-iso protein SEQ ID NO:7. 


558 


38 


494 


gi6179740


Homo sapiens


paraneoplastic neuronal antigen MA3
(MA3) mRNA, complete cds. 


421 


51 


494 


gi12053257


Homo sapiens 


mRNA; cDNA DKFZp434K225 (from 
clone DKFZp434K225); complete cds. 


421 


51 


494 


AAB12529


Homo sapiens


Human Ma5 protein SEQ ID NO: 13.


421 


51 


495 


gi13384467


Caenorhabditis 
elegans 


contains similarity to CDP-alcohol 
phosphotransferases 


391 


35 


495 


gi3661595 


Arabidopsis 
thaliana 


aminoalcoholphosphotransferase 


411 


32 


495 


gi530088 


Glycine max 


aminoalcoholphosphotransferase


410 


31 


496 


gi9963853 


Homo sapiens 


HT018 mRNA, complete cds. 


1368 


100 


496 


AAG71359 


Homo sapiens 


Human gene 10-encoded secreted protein 
fragment, SEQ ID NO:210. 


50 


50 


496 


AAY20863 


Homo sapiens 


Human presenilin I mutant protein 
fragment 9. 


61 


36 


497 


gi13241761


Homo sapiens 


transmembrane protein induced by tumor 
necrosis factor alpha (TMPIT) mRNA, 
complete cds. 


1286 


70 


497 


AAB12123 


Homo sapiens 


Hydrophobic domain protein from clone 
HP10608 isolated from Saos-2 cells.


1286 


70 


497 


AAB38371 


Homo sapiens 


Human secreted protein encoded by gene 
51 clone HLDQC46. 


331 


67 


498 


AAY86234 


Homo sapiens 


Human secreted protein HNTNC20, SEQ 
ID NO: 149. 


126 


32 


498 


AAB24074 


Homo sapiens 


Human PRO1153 protein sequence SEQ
ID NO:49.


113 


54 


498 


AAY66735 


Homo sapiens 


Membrane-bound protein PRO1153.


113 


54


499


AAB93704 


Homo sapiens 


Human protein sequence SEQ ID 
NO: 13287. 


3677 


99 


499 


gi2792496 


Rattus 
norvegicus 


tulip 2 


1339 


70 


499 


gi2792494 


Rattus 
norvegicus 


tulip 1 


1159 


48 


500 


gi10438718


Homo sapiens


cDNA: FLJ22362 fis, clone HRC06544.


1224 


100 


500 


gi310897


Thermobifida
fusca


beta-1,4-endoglucanase precursor


138 


36 


500 


AAY59066 


Homo sapiens 


Human tie receptor FNIII repeat fragment
2.


99 


26 


501 


gi4519607


Homo sapiens


Nurr1 gene, complete cds.


1342 


100 


501 


gi4760535 


Homo sapiens 


gene for T-cell nuclear receptor NOT 
(Nurr1), complete cds.


1342 


100 


501 


gi14424530


Homo sapiens


nuclear receptor subfamily 4, group A,
member 2, clone MGC:14354
IMAGE:4298967, mRNA, complete cds. 


1342 


100 


502 


gi7288872 


Rattus 
norvegicus 


taste receptor rT2R6 


398 


32 


502 


gi7262617 


Homo sapiens 


candidate taste receptor T2R9 gene, 
complete cds. 


397 


33 


502 


AAB87739 


Homo sapiens 


Human T2R09 amino acid sequence SEQ 
ID NO: 17. 


397 


33 


503 


gi7022610 


Homo sapiens 


cDNA FLJ10521 fis, clone
NT2RP2000841. 


3005 


98 


503 


AAB92909 


Homo sapiens 


Human protein sequence SEQ ID 
NO: 11539.


3005 


98 


503 


gi13111772


Homo sapiens


clone MGC:2899 IMAGE:3010245,
mRNA, complete cds. 


649 


99 


504 


AAB51244 


Homo sapiens 


Human haemopoietin receptor protein 
NR10.3 SEQ ID NO: 17. 


3066 


99 


504 


AAB51242 


Homo sapiens 


Human haemopoietin receptor protein 
NR10.1 SEQIDNO:2. 


3018 


100 


504 


AAB51243 


Homo sapiens 


Human haemopoietin receptor protein 
NR10.2 SEQ ID NO:4. 


885 


100 


505 


AAG71668 


Homo sapiens 


Human olfactory receptor polypeptide, 
SEQ ID NO: 1349. 


1547 


97 


505 


AAG71507 


Homo sapiens 


Human olfactory receptor polypeptide, 
SEQ ID NO: 1188. 


1399 


90 


505 


AAG71676 


Homo sapiens 


Human olfactory receptor polypeptide, 
SEQ ID NO: 1357. 


1126 


70 


506 


gi10438252


Homo sapiens


cDNA: FLJ22009 fis, clone HEP07114.


2022


99 


506 


gi12654279


Homo sapiens


clone IMAGE:3451160, mRNA, partial
cds. 


1975 


100 


506


gwlU2877 


Mus musculus 


Shc binding protein


1915 


70 


507 


gi12248917


Homo sapiens 


mRNA for spinesin, complete cds. 


1404 


100 


507 


AAB11699 


Homo sapiens 


Human serine protease BSSP2 (hBSSP2), 
SEQ ID NO: 10. 


1404 


100


507 


AAB08950 


Homo sapiens 


Human secreted protein sequence encoded 
by gene 22 SEQ ID NO: 107. 


1207 


100 


508 


gi7715916 


Mus musculus 


SorCSb splice variant of the VPS10
domain receptor SorCS 


4966 


96 


508 


gi6692583 


Mus musculus 


VPS10 domain receptor protein SORCS 


4961 


96 


508 


gi12007720


Mus musculus


VPS10 domain receptor protein SorCS2


2613 


49


509


gi10566471


Mus musculus 


Gliacolin 


1284 


94 


509 


gi14278927


Mus musculus 


gliacolin 


1284 


94 


509 


gi3747097 


Homo sapiens 


Clq-related factor mRNA, complete cds. 


974 


71 


510 


gi7332063 


Caenorhabditis 
elegans 


contains similarity to Strongylocentrotus 
purpuratus Spec3 protein (SP:P16537)


147 


41 


510 


gi12247892


Sterkiella
histriomuscorum


SPEC3-like protein 


85 


36 


510 


gi483822 


Gallus gallus 


vitellogenin gene-binding protein, 
alpha/alpha isoform 


73 


47 


511 


AAB25755 


Homo sapiens 


Human secreted protein sequence encoded 
by gene 33 SEQ ID NO: 144. 


648 


100 


511 


AAB25754 


Homo sapiens 


Human secreted protein sequence encoded 
by gene 33 SEQ ID NO: 143. 


301 


100 


511 


AAB25697 


Homo sapiens 


Human secreted protein sequence encoded 
by gene 33 SEQ ID NO:86. 


278 


100 


512 


gi13810306


Homo sapiens 


mRNA for transmembrane protein 7 
(TMEM7 gene). 


1271 


100 


512 


gi11065721


Homo sapiens 


mRNA for 28kD interferon responsive 
protein (IFRG28 gene). 


420 


45 


512 


AAB84453 


Homo sapiens 


Amino acid sequence of a human 
interferon-alpha induced protein. 


420 


45 


513 


AAG72504 


Homo sapiens 


Human OR-like polypeptide query 
sequence, SEQ ID NO: 2185. 


1615 


99 


513 


AAG71709 


Homo sapiens 


Human olfactory receptor polypeptide, 
SEQ ID NO: 1390. 


1611 


99 


513 


AAG72127 


Homo sapiens 


Human olfactory receptor polypeptide, 
SEQ ID NO: 1808. 


829 


99 


514 


AAB83079 


Homo sapiens 


Human CASB6411 protein.


1806 


100 


514 


AAB08764 


Homo sapiens 


A human leukocyte and blood related 
protein (LBAP). 


1424 


100 


514 


gi10435645


Homo sapiens


cDNA FLJ13593 fis, clone
PLACE1009493. 


1124 


100 


515 


AAB74716 


Homo sapiens 


Human membrane associated protein 
MEMAP-22. 


1094 


99 


515 


gi6093235 


Homo sapiens 


mRNA; cDNA DKFZp566N034 (from 
clone DKFZp566N034); partial cds.


424 


94 


515 


gi15157430


Agrobacterium
tumefaciens


AGR_C_4131p


131 


25 


516 


gi13447610


Homo sapiens 


VTS20631 mRNA, g-protein coupled 
receptor family, partial cds. 


3804 


99 


516 


gi10441732


Homo sapiens 


leucine-rich repeat-containing G protein- 
coupled receptor 6 (LGR6) mRNA, partial 
cds. 


3782 


100 


516 


gi3366802 


Homo sapiens 


orphan G protein-coupled receptor HG38 
mRNA, complete cds. 


1805 


52 


517 


AAB24465 


Homo sapiens 


Human secreted protein sequence encoded 
by gene 29 SEQ ID NO:90. 


447 


98 


517 


gi1749851


Human
immunodeficiency virus type
1


tat protein 


60 


36 


517 


gi2245481 


Human
immunodeficiency virus type
1


Tat protein


59


33


518 


gi5802879 


Homo sapiens 


AIM-1 protein mRNA, complete cds. 


458 


44 


518 


gi15028433


Mus musculus


B/AIM-1-like protein


453 


45 


518 


gi4680229 


Homo sapiens 


DNb-5 mRNA, partial cds. 


498 


41 


519 


gi5525078 


Rattus 
norvegicus 


seven transmembrane receptor 


788 


31 


519 


AAY57288 


Homo sapiens 


Human GPCR protein (HGPRP) sequence 
(clone ID 3036563). 


752 


29 


519 


AAY40440 


Homo sapiens 


Human brain-derived G-protein coupled 
receptor protein. 


746 


29 


520 


AAY27577 


Homo sapiens 


Human secreted protein encoded by gene 
No. 11. 


598 


100 


520 


gi1617316


Homo sapiens 


H.sapiens mRNA for tenascin-R. 


97 


26 


520 


gi4379056 


Homo sapiens 


H. sapiens mRNA for tenascin-R 
(restrictin). 


97 


26 


521 


gi10434488


Homo sapiens


cDNA FLJ12791 fis, clone
NT2RP2001991, highly similar to 
SODIUM- AND CHLORIDE- 
DEPENDENT TRANSPORTER NTT73. 


1523 


100 


521 


AAB94304 


Homo sapiens 


Human protein sequence SEQ ID 
NO: 14767. 


1523 


100 


521 


gi11907841


Homo sapiens 


orphan neurotransmitter transporter v7-3 
mRNA, complete cds. 


1353 


92 


522 


gi10437307


Homo sapiens


cDNA: FLJ21240 fis, clone COL01132.


677 


38 


522 


AAY94906 


Homo sapiens 


Human secreted protein clone rb649_3 
protein sequence SEQ ID NO: 18.


644 


37 


522 


AAB74730 


Homo sapiens 


Human membrane associated protein 
MEMAP-36. 


644 


37 


523 


AAB43665 


Homo sapiens 


Human cancer associated protein sequence 
SEQ ID NO:1110.


1254 


100 


523 


AAY19759


Homo sapiens


SEQ ID NO 477 from WO9922243.


966 


100 


523 


gi12804249


Homo sapiens


Similar to gene rich cluster, C9 gene,
clone MGC:2519 IMAGE:3546861,
mRNA, complete cds. 


411 


46 




524

AAoUiOx5 


Homo sapiens 


Human G-protein coupled receptor fb41a. 


1925 


94 


524 


AAB70143 


Homo sapiens 


Human G protein-coupled receptor 
protein. 


1925 


94 


524 


AAW79258 


Homo sapiens 


Human G protein coupled receptor 15 E. 


1877 


93 


525 


gi7023154 


Homo sapiens 


cDNA FLJ10856 fis, clone
NT2RP4001547. 


943 


53 


525 


AAY28810


Homo sapiens 


nn296_2 secreted protein. 


943 


53 


525


AAB93258 


Homo sapiens 



Human protein sequence SEQ ID 
NO: 12282. 


943 


53 




gll 1 0 /oU_>0 


Sus scrofa 


somatostatin receptor 1 


198 


25 


526 


gi12056166


Yaba-like 
disease virus 


7L protein 


196 


26 


526 


gi13876663


lumpy skin 
disease virus 


G-protein-coupled chemokine receptor- 
like protein 


197 


25 


527 


gi3880799 


Caenorhabditis 
elegans 


Y39A1B.2


441 


24 


527 


gi1707052


Caenorhabditis 
elegans 


similar to drosophilia and mouse patched 
proteins 


368 


23 


527 


gi1255388


Caenorhabditis
elegans


similar to drosophila membrane protein
PATCHED (SP:P18502)


191


23


528 


AAB34321 


Homo sapiens 


Human secreted protein sequence encoded 
by gene 23 SEQ ID NO:82. 


74 


38 


528 


AAB51693 


Homo sapiens 


Human secreted protein related amino acid 
sequence SEQ ID NO: 133. 


51 


55 


528 


AAB87388 


Homo sapiens 


Human gene 47 encoded secreted protein 
HFXDK20, SEQ ID NO: 129. 


68 


44 


529 


AAY94297 


Homo sapiens 


Human coenzyme A-utilising enzyme
CoAEN-5. 


1581 


69 


529 


AAY66699 


Homo sapiens 


Membrane-bound protein PRO1108.


1581 


69 


529 


AAB65222 


Homo sapiens 


Human PRO1108 (UNQ551) protein
sequence SEQ ID NO:248. 


1581 


69 


530 


AAY29332 


Homo sapiens 


Human secreted protein clone pe584_2 
protein sequence. 


1282 


99 


530 


AAB58289 


Homo sapiens 


Lung cancer associated polypeptide 
sequence SEQ ID 627. 


1282 


99 


530 


AAB75246 


Homo sapiens 


Human secreted protein sequence encoded 
by gene 7 SEQ ID NO:65.


1282 


99 


531 


AAB08538 


Homo sapiens 


A human G-protein coupled receptor 
designated 14273. 


787 


100 


531 


AAY44662 


Homo sapiens 


Human 14273 G-protein coupled receptor 
(GPCR). 


765 


98 


531 


AAY44815 


Homo sapiens 


Human 14273 G-protein coupled receptor 
(GPCR) version 2. 


761 


97 


532 


AAG71706 


Homo sapiens 


Human olfactory receptor polypeptide, 
SEQ ID NO: 1387. 


1579 


99 


532 


AAG71705 


Homo sapiens 


Human olfactory receptor polypeptide, 
SEQ ID NO: 1386. 


1180 


74 


532 


AAG71679 


Homo sapiens 


Human olfactory receptor polypeptide, 
SEQ ID NO: 1360. 


1089 


68 


533 


gi557822 


Saccharomyces
cerevisiae


mal5, sta1, len: 1367, CAI: 0.3,
AMYH_YEAST P08640
GLUCOAMYLASE S1 (EC 3.2.1.3)


362 


27 


533 


gi1304387


Saccharomyces
cerevisiae
var. diastaticus


glucoamylase 


362 


27 


533 


gi7332056 


Caenorhabditis 
elegans 


contains similarity to Pfam family 

DCAAA")0 fO - - - - * /n\l A 

FrOUU/o (Reverse transcriptase (RNA- 
aepenaeni j;, score— /y.o, fc^Cje-xU, t— I 


345 


27 


534




— : 

Homo sapiens 


Human dendritic cell membrane protein 

PTPP 


1 OA 1 


Ol 

91 




A AVOIDS 


Homo sapiens


Human secreted protein sequence encoded
by gene 22 SEQ ID NO:298.


loH\) 


Qf\ 




a a v^o^nn 


Homo sapiens 


numan ti\jrv-.K polypeptide. 


1101 


<o 

Jo 


535 


gi10438710


Homo sapiens


cDNA: FLJ22357 fis, clone HRC06404.




inn 

lUv 


535 


gi14336678


Homo sapiens


16p13.3 sequence section 1 of 8.


4547 


99 


535 


AAB61148 


Homo sapiens 


Human NOV 17 protein. 


1955 


67 


536 


gi10438710


Homo sapiens


cDNA: FLJ22357 fis, clone HRC06404.


4379 


100 


536 


gi14336678


Homo sapiens


16p13.3 sequence section 1 of 8.


4354 


99


536 


AAB61148 


Homo sapiens 


Human NOV 17 protein. 


1955 


67 


537 


gi10439790


Homo sapiens


cDNA: FLJ23186 fis, clone LNG11945.


753 


99 


537 


gi310100 


Rattus 
norvegicus 


developmentally regulated protein 


86 


30 


537 


gi5824457 


Caenorhabditis
elegans


contains similarity to Pfam domain:
PF00615 (Regulator of G protein signaling
domain), Score=200.4, E-value=9.1e-57,
N=1


78


30


538 


AAG71899 


Homo sapiens 


Human olfactory receptor polypeptide, 
SEQ ID NO: 1580. 


1603 


100 


538 


gi5869925 


Mus musculus 


olfactory receptor 


1322 


82 


538 


AAG71954 


Homo sapiens 


Human olfactory receptor polypeptide, 
SEQ ID NO: 1635. 


883 


54 


539 


gi466604 


Escherichia 
coli 


No definition line found 


90 


25 


539 


gi52952 


Mus musculus 


delta-aminolevulinate dehydratase (AA 1 - 
330) 


82 


35 


539 


gi4262032 


Bos taurus


D5 dopamine receptor 


59 


64 


540 


gi12803977


Homo sapiens 


clone MGC:4175 IMAGE:3634983, 
mRNA, complete cds. 


611 


100 


540 


AAB34781 


Homo sapiens 


Human secreted protein sequence encoded 
by gene 9 SEQ ID NO:69. 


58 


39 


540 


AAW39938 


Homo sapiens 


Peptide effecting G-protein-coupled
receptor activity. 


57 


37 


541 


AAY73442 


Homo sapiens 


Human secreted protein clone ya66_1
protein sequence SEQ ID NO: 106. 


596 


95 


541 


AAB63255 


Homo sapiens 


Human breast cancer associated antigen 
protein sequence SEQ ID NO:617. 


95 


40 


541 


gi13182890


Macaca 
mulatta 


collagen type III alpha 1 


79 


46 


542 


gi9929914 


Homo sapiens 


MUC3B gene for intestinal mucin, partial 
cds. 


4024 


99 


542 


gi9929918 


Homo sapiens 


MUC3B mRNA for intestinal mucin, 
partial cds. 


4024 


99 


542 


gi11990203


Homo sapiens 


partial MUC3B gene for MUC3B mucin, 
exonsl-U. 


3985 


98 


543 


gi14043332


Homo sapiens


Similar to ring finger protein 23, clone
MGC:2475 IMAGE:3051389, mRNA,
complete cds. 


925 


40 


543 


gi10716078


Mus musculus 


testis-abundant finger protein 


919 


40 


543 


gi12407417


Mus musculus


tripartite motif protein TRIM11


671 


35 


544 


gi57131 


Rattus 
norvegicus 


ribosomal protein S26 


260 


68 


544 


gi12803549


Homo sapiens


ribosomal protein S26, clone MGC:1963
IMAGE:3143099, mRNA, complete cds.


260 


68 


544 


gi456351 


Homo sapiens 


H.sapiens RPS26 mRNA. 


260 


68 


545 


gi10438861


Homo sapiens


cDNA: FLJ22461 fis, clone HRC10107.


1258 


42 


545 


gi15079400


Homo sapiens


clone MGC:16796 IMAGE:3855477,
mRNA, complete cds. 


1258 


42 


545


gioo83905 


Drosophila 
melanogaster 


Dispatched 


412 


37 


546 


AAY72910 


Homo sapiens 


Human IGS3 G-protein coupled receptor
(GPCR) protein. 


589 


58 


546 


AAB67654 


Homo sapiens 


Amino acid sequence of a human G- 
protein coupled receptor (Ant). 


589 


58 


546 


AAF55661 
aal 


Homo sapiens 


Nucleotide sequence of a human G-protein 
coupled receptor (Ant). 


589 


58 


547 


gi6740013 


Homo sapiens 


clone cDSCl Down syndrome cell
adhesion molecule (DSCAM) mRNA,
complete cds.


6373


60


547 


AAW42086 


Homo sapiens 


Human Down syndrome-cell adhesion 
molecule DS-CAM1. 


6347 


62 


547 


gi11066998


Mus musculus


Down syndrome cell adhesion molecule 


6344 


60 


548 


gi12656633


Homo sapiens 


transmembrane gamma-carboxyglutamic 
acid protein 3 TMG3 mRNA, complete 
cds. 


1192 


100 


548 


gi2338290 


Homo sapiens 


proline-rich Gla protein 1 (PRGP1)
mRNA, complete cds. 


283 


49 


548 


gi506601 


Rattus
norvegicus 


factor X 


206 


49 


549 


gi12698682


Homo sapiens 


testis-expressed transmembrane-4 protein 
(TETM4) mRNA, complete cds. 


588 


95 


549 


gi11559214


Homo sapiens 


mRNA for MS4A5, complete cds. 


588 


95 


549 


gi13649401


Homo sapiens 


MS4A5 protein mRNA, complete cds. 


588 


95 


550 


gi12054393


Homo sapiens


6M1-10*01 gene for olfactory receptor,
cell line BM28.7. 


1853 


100 


550 


gi12054395


Homo sapiens


6M1-10*01 gene for olfactory receptor,
cell line BM19.7.


1853 


100 


550 


gi12054397


Homo sapiens 


6M1-10*01 gene for olfactory receptor, 
cell line LG2. 


1853 


100 


551 


gi11275360


Homo sapiens 


SLC4A10 mRNA for NCBE, complete 
cds. 


5677 


99 


551 


gill 182364 


Mus musculus 


NCBE 


5542 


96 


551 


gi7385123 


Mus musculus 


sodium bicarbonate cotransporter isoform 
3 kNBC-3 


4364 


76 


552 


AAE04178 


Homo sapiens 


Human gene 3 encoded secreted protein 
fragment, SEQ ID NO: 169. 


1111 


98 


552 


AAE04127 


Homo sapiens 


Human gene 3 encoded secreted protein 
HSDJL42, SEQ ID NO: 114. 


1078 


98 


552 


AAE04102 


Homo sapiens 


Human gene 3 encoded secreted protein 
HSDJL42, SEQ ID NO:88. 


1068 


98 



WO 03/025148 



PCT/US02/29964 



150 
Table 2B 



SEQ 
ID 


Hit ID 


Species 


Description 


S 
score 


Percent 
identity 


277 


AAY55787 


Homo 
sapiens 


INCY- Human zinc RING (ZIRI) protein. 


1859 


95 


277 


AAW81821 


Homo 
sapiens 


INCY- Human ZIRI protein. 


1859 


95 


277 


gi3387925 


Homo 
sapiens 


RING zinc finger protein RZF 


1859 


95 


278 


AAY55787 


Homo 
sapiens 


INCY- Human zinc RING (ZIRI) protein. 


1703 


88 


278 


AAW81821 


Homo 
sapiens 


INCY- Human ZIRI protein. 


1703 


88 


278 


gi3387925 


Homo 
sapiens 


RING zinc finger protein RZF 


1703 


88 


279 


AAY55787 


Homo 
sapiens 


INCY- Human zinc RING (ZIRI) protein. 


1769 


92 


279 


AAW81821 


Homo 
sapiens 


INCY- Human ZIRI protein. 


1769 


92 


279 


gi3387925 


Homo 
sapiens 


RING zinc finger protein RZF 


1769 


92 


280 


AAB24463 


Homo 
sapiens 


HUMA- Human secreted protein sequence 
encoded by gene 27 SEQ ID NO:88. 


1346 


96 


280 


AAU27674 


Homo 
sapiens 


ZYMO Human protein AFP669232. 


1334 


95 


280 


AAB34813 


Homo 
sapiens 


HUMA- Human secreted protein sequence 
encoded by gene 41 SEQ ID NO:101. 


701 


93 


281 


ABB89737 


Homo 
sapiens 


HUMA- Human polypeptide SEQ ID NO 
2113. 


614 


87 


281 


AAG89173 


Homo 
sapiens 


GEST Human secreted protein, SEQ ID NO: 
293. 


614 


87 


281 


AAM25811 


Homo 
sapiens 


HYSE- Human protein sequence SEQ ID 
NO: 1326. 


614 


87 


282 


AAW61622 


Homo 
sapiens 


HUMA- Clone HTPBA27 of TM4SF 
superfamily. 


841 


93 


282 


gi2997747 


Homo 
sapiens 


tetraspan TM4SF; Tspan-4 


841 


93 


282 


gi2586350 


Homo 
sapiens 


tetraspan 


841 


93 


283 


gi15080477 


Homo 
sapiens 


Similar to RIKEN cDNA 2310010G13 gene 


2034 


97 


283 


gi17512422 


Mus 
musculus 


Similar to RIKEN cDNA 2310010G13 gene 


1577 


76 


283 


gil7427162 


Ralstonia 
solanacearum 


TRANSPORT TRANSMEMBRANE 
PROTEIN 


315 


28 


284 


ABB05645 


Homo 
sapiens 


BODE- Human thyroglobulin 38 protein 
SEQ ID NO:2. 


1858 


100 


284 


ABB05646 


Homo 
sapiens 


BODE- Human thyroglobulin 38 protein N- 
terminal peptide SEQ ID NO:7. 


88 


100 


284 


gi21322795 


Corynebacterium 
glutamicum 
ATCC 13032 


ABC-type transporter, permease 
components 


78 


22 


285 


gil8157547 


Mus 

musculus 


pecanex-like 3 


1791 


93 


285 


gi15076843 


Homo 
sapiens 


pecanex-like protein 1 


871 


34 


285 


AAM42412 


Homo 
sapiens 


HUMA- Human polypeptide SEQ ID NO 
145. 


743 


100 


286 


gi17390957 


Mus 

musculus 


Similar to RIKEN cDNA 2010001E11 gene 


184 


26 


286 


gi2650264 


Archaeoglobus 
fulgidus 


oxalate/formate antiporter (oxlT-2) 


95 


22 


286 


gi19712705 


Fusobacterium 
nucleatum subsp. 
nucleatum ATCC 
25586 


Multidrug resistance protein 2 


94 


18 


287 


AAW27484 


Homo 
sapiens 


IMUT- Human MCP. 


1991 


96 


287 


gil80137 


Homo 
sapiens 


membrane cofactor protein 


1991 


96 


287 


AAR93939 


Homo 
sapiens 


AUST- CD46 wild-type. 


1986 


96 


288 


AAE01687 


Homo 
sapiens 


HUMA- Human gene 16 encoded secreted 
protein HDPMM88, SEQ ID NO:99. 


1019 


100 


288 


AAO14187 


Homo 
sapiens 


INCY- Human transporter and ion channel 
TRICH-4. 


560 


58 


288 


gi20988041 


Homo 
sapiens 


Similar to ATPase, Class I, type 8B, 
member 2 


560 


58 


289 


AAG81436 


Homo 
sapiens 


ZYMO Human AFP protein sequence SEQ 
ID NO:390. 


392 


100 


289 


AAG74872 


Homo 
sapiens 


HUMA- Human colon cancer antigen 
protein SEQ ID NO:5636. 


392 


100 


289 


AAB08863 


Homo 
sapiens 


INCY- Amino acid sequence of a human 
secretory protein. 


392 


100 


290 


gi1226246 


Homo 
sapiens 


mono-ADP-ribosyltransferase 


1880 


94 


290 


gi2677616 


Mus 

musculus 


NAD(P)(+)-arginine ADP- 
ribosyltransferase 


1142 


60 


290 


gi20067374 


Mus 

musculus 


ART3 mono(ADP-ribosyl)transferase 


1071 


58 


291 


AAB70690 


Homo 
sapiens 


SREN- Human hDPP protein sequence SEQ 
ID NO:7. 


598 


100 


291 


AAG89279 


Homo 
sapiens 


GEST Human secreted protein, SEQ ID NO: 
399. 


598 


100 


291 


gi13182757 


Homo 
sapiens 


HTPAP 


598 


100 


292 


AAU83599 


Homo 
sapiens 


GETH Human PRO protein, Seq ID No 16. 


760 


100 


292 


AAB88415 


Homo 
sapiens 


HELI- Human membrane or secretory 
protein clone PSEC0181. 


725 


100 


292 


ABK09980_ 
aal 


Homo 
sapiens 


JAKO/ Human prostate stem cell antigen 
(PSCA) cDNA sequence. 


101 


32 


293 


gil2718841 


Mus 

musculus 


Skullin 


279 


38 


293 


gi4191356 


Mus 

musculus 


claudin-6 


277 


38 


293 


git 3543081 


Mus 

musculus 


claudin 6 


277 


38 


294 


ABB50276 


Homo 
sapiens 


USSH HLA-DR alpha chain ovarian tumour 
marker protein, SEQ ID NO:41. 


1214 


92 


294 


AAB58160 


Homo 
sapiens 


ROSE/ Lung cancer associated polypeptide 
sequence SEQ ID 498. 


1214 


92 


294 


gil5929084 


Homo 
sapiens 


major histocompatibility complex, class II, 
DR alpha 


1214 


92 


295 


AAE15283 


Homo 
sapiens 


INCY- Human RNA metabolism protein-46 
(RMEP-46). 


2777 


99 


295 


gil6768810 


Drosophila 
melanogaster 


LD05247p 


1133 


46 


295 


gi16185327 


Drosophila 
melanogaster 


LD38433p 


906 


40 


296 


gil2620132 


Homo 
sapiens 


renal sodium/sulfate cotransporter 


3100 


100 


296 


gi469555 


Rattus 
norvegicus 


Na/Sulfate cotransporter 


2627 


82 


296 


gi310183 


Rattus 
norvegicus 


sodium dependent sulfate transporter 


2627 


82 


297 


AAY44245 


Homo 
sapiens 


INCY- Human cell signalling protein-8. 


1522 


89 


297 


AAE06590 


Homo 
sapiens 


SAGA Human protein having hydrophobic 
domain, HP10785. 


1327 


80 


297 


AAM93721 


Homo 
sapiens 


HELI- Human polypeptide, SEQ ID NO: 
3671. 


1205 


99 


298 


AAE13277 


Homo 
sapiens 


INCY- Human transporters and ion channels 
(TRICHH- 


3306 


92 


298 


AAD06381_ 
aal 


Homo 
sapiens 


ACTI- Human ATP binding cassette, 
ABCB9 transporter cDNA. 


2338 


99 


298 


AAE02437 


Homo 
sapiens 


ACTI- Human ATP binding cassette, 
ABCB9 transporter protein. 


2338 


99 


299 


gi20072551 


Mus 

musculus 


RIKEN cDNA 4930511J11 gene 


342 


40 


299 


gi17974542 


Homo 
sapiens 


voltage-dependent calcium channel gamma- 
8 subunit 


118 


25 


299 


gil3357l80 


Homo 
sapiens 


calcium channel gamma subunit 8 


117 


25 


300 


gi20258606 


Homo 
sapiens 


sideroflexin 5 


1178 


100 


300 


gi3874886 


Caenorhabditis 
elegans 


C41C4.2 


592 


46 


300 


gi13543138 


Mus 

musculus 


RIKEN cDNA 2810002O05 gene 


401 


38 


301 


AAE07054 


Homo 
sapiens 


HUMA- Human gene 4 encoded secreted 
protein rio i ArJlD, dtv^ LU INU: / 1 . 


612 


29 


301 


AAE07077 


Homo 
sapiens 


HUMA- Human gene 4 encoded secreted 
protein HSYAB05, SEQ ID NO:94. 


143 


23 


301 


gi9964007 


Homo 
sapiens 


MAB21L2 protein 


105 


33 


302 


ABB89405 


Homo 
sapiens 


HUMA- Human polypeptide SEQ ID NO 
1781. 


1337 


98 


302 


gil5030135 


Mus 

musculus 


RIKEN cDNA 1110020A09 gene 


769 


60 


302 


gil6767870 


Drosophila 
melanogaster 


GH02466p 


284 


36 


303 


AAE13349 


Homo 
sapiens 


SENO- Human TSTP protein, 165-015D. 


1652 


100 


303 


AAE13348 


Homo 
sapiens 


SENO- Human TSTP protein, 165-015C. 


589 


40 


303 


AAE13350 


Homo 
sapiens 


SENO- Human TSTP protein, 165-015E. 


314 


31 


304 


ABB89737 


Homo 
sapiens 


HUMA- Human polypeptide SEQ ID NO 
2113. 


489 


100 


304 


AAG89173 


Homo 
sapiens 


GEST Human secreted protein, SEQ ID NO: 
293. 


489 


100 


304 


AAM25811 


Homo 
sapiens 


HYSE- Human protein sequence SEQ ID 
NO: 1326. 


489 


100 


305 


gi16648454 


Drosophila 
melanogaster 


SD01285p 


290 


30 


305 


AAY87336 


Homo 
sapiens 


INCY- Human signal peptide containing 
protein HSPP-113 SEQ ID NO:113. 


222 


28 


305 


gi4877582 


Homo 
sapiens 


lipoma HMGIC fusion partner 


222 


28 


306 


AAE14439 


Homo 
sapiens 


INCY- Human drug metabolising enzyme 
(DME)-2. 


1123 


98 


306 


ABB84932 


Homo 
sapiens 


GETH Human PRO3579 protein sequence 
SEQ ID NO:232. 


1123 


98 


306 


AAB87576 


Homo 
sapiens 


GETH Human PRO3579. 


1123 


98 


307 


gi18857903 


Homo 
sapiens 


TCBA1 


867 


100 


307 


AAG78000 


Homo 
sapiens 


BIOW- Human aenn 14. 


663 


100 


307 


ABB89045 


Homo 
sapiens 


HUMA- Human polypeptide SEQ ID NO 
1421. 


644 


98 


308 


gi4580997 


Mus 

musculus 


cAMP inducible 2 protein 


2377 


87 


308 


gi18676548 


Homo 
sapiens 


FLJ00171 protein 


1877 


100 


308 


gi20073163 


Mus 

musculus 


Similar to solute carrier family 37 (glycerol- 
3-phosphate transporter), member 1 


1572 


60 


309 


AAG71797 


Homo 
sapiens 


YEDA Human olfactory receptor 
polypeptide, SEQ ID NO: 1478. 


755 


100 


309 


AAG66336 


Homo 
sapiens 


CURA- Human NOV 16 protein sequence. 


755 


100 


309 


AAU24615 


Homo 
sapiens 


SENO- Human olfactory receptor 
AOLFR108. 


755 


100 


ii i 
i 


A A CA1 icn 
AAoU i Z5U_ 

aal 


Homo 
sapiens 


Human alpha nicotinic acetylcholine 
receptor cDNA sequence. 




10ft 


311 


AAD27812_ 
aal 


Homo 
sapiens 


GLAX Human nicotinic acetylcholine 
receptor gene, sbg471005nAChR. 


2370 


100 


311 


AAE17317 


Homo 
sapiens 


GLAX Human nicotinic acetylcholine 
receptor protein, sbg471005nAChR. 


2370 


100 


312 


gi21518639 


Homo 
sapiens 


TSLC1-like 2 


1991 


97 


312 


gil9068139 


Mus 

musculus 


membrane glycoprotein 


1970 


96 


312 


AAM78418 


Homo 
sapiens 


HYSE- Human protein SEQ ID NO 1080. 


1905 


97 


313 


AAG67512 


Homo 
sapiens 


SMIK Amino acid sequence of a human 
secreted polypeptide. 


3994 


100 


313 


AAH78215_ 
aal 


Homo 
sapiens 


SMIK Nucleotide sequence of a human 
secreted polypeptide. 


1659 


57 


313 


AAG67523 


Homo 
sapiens 


SMIK Amino acid sequence of a human 
secreted polypeptide. 


1659 


57 


314 


ABB90749 


Homo 
sapiens 


UYJO Human Tumour Endothelial Marker 
polypeptide SEQ ID NO 230. 


2691 


100 


314 


ABB90723 


Homo 
sapiens 


UYJO Human Tumour Endothelial Marker 
polypeptide SEQ ID NO 179. 


2691 


100 


314 


gi15987487 


Homo 
sapiens 


tumor endothelial marker 3 precursor 


2691 


100 


315 


ABB90749 


Homo 
sapiens 


UYJO Human Tumour Endothelial Marker 
polypeptide SEQ ID NO 230. 


2600 


97 


315 


ABB90723 


Homo 
sapiens 


UYJO Human Tumour Endothelial Marker 
polypeptide SEQ ID NO 179. 


2600 


97 


315 


gi15987487 


Homo 
sapiens 


tumor endothelial marker 3 precursor 


2600 


97 


316 


AAG66705 


Homo 
sapiens 


CURA- Human GPCR3 polypeptide. 


1494 


100 


316 


AAG71567 


Homo 
sapiens 


YEDA Human olfactory receptor 
polypeptide, SEQ ID NO: 1248. 


1414 


100 


316 


gil 8480740 


Mus 

musculus 


olfactory receptor MOR267-14 


1017 


67 


317 


AAU83597 


Homo 
sapiens 


GETH Human PRO protein, Seq ID No 12. 


690 


31 


317 


ABB10293 


Homo 
sapiens 


HUMA- Human cDNA SEQ ID NO:601. 


651 


100 


317 


ABB10483 


Homo 
sapiens 


HUMA- Human cDNA SEQ ID NO:791. 


642 


99 


318 


gil 0944274 


Homo 
sapiens 


bA346K17.2 (A novel protein similar to the 
cell division control protein 91 (CDC91, 
YLR459W or L9122.2) from Yeast) 


2235 


100 


318 


gi20988986 


Homo 
sapiens 


CDC91 cell division cycle 91-like 1 (S. 
cerevisiae) 


2235 


100 


318 


AAB88430 


Homo 
sapiens 


HELI- Human membrane or secretory 
protein clone PSEC0205. 


2226 


99 


319 


AAY19506 


Homo 
sapiens 


HUMA- Amino acid sequence of a human 
secreted protein. 


1120 


100 


319 


gi|17540010| 
ref|NP_503066.1| 


Caenorhabditis 
elegans 


F26D10.11.p 


83 


28 


319 


gi|14149748| 
ref|NP_068365.1| 


Mus 

musculus 


claudin 15 


72 


20 


320 


gi784990 


Homo 
sapiens 


5-HT5A serotonin receptor 


1645 


100 


320 


gi20379144 


Homo 
sapiens 


5-hydroxytryptamine receptor 5A 


1645 


100 


320 


AAR45848 


Homo 
sapiens 


INRM Human 5HT5a serotonin receptor. 


1611 


98 


321 


AAS07947_ 
aal 


Homo 
sapiens 


AREN- Human cDNA encoding G-protein 
coupled receptor, hRUP20. 


1734 


100 


321 


AAD13260_ 
aal 


Homo 
sapiens 


MILL- Human 39406 cDNA. 


1734 


100 


321 


AAM50774 


Homo 
sapiens 


INGE- Human G protein coupled receptor 
IGPcR20. 


1734 


100 


322 


AAY25806 


Homo 
sapiens 


HUMA- Human secreted protein fragment 
encoded from gene 23. 


1663 


98 


322 


gil9528215 


Drosophila 
melanogaster 


AT30101p 


1012 


38 


322 


AAM93717 


Homo 
sapiens 


HELI- Human polypeptide, SEQ ID NO: 
3663. 


1011 


100 


323 


AAB12119 


Homo 
sapiens 


PROT- Hydrophobic domain protein from 
clone HP02869 isolated from KB cells. 


448 


100 


323 


gi4827164 


Gluconacetobacter 
xylinus 


similar to melibiose carrier protein of E.coli 


89 


26 


323 


gi595475 


Homo 
sapiens 


hFcRn 


84 


31 


324 


AAY25736 


Homo 
sapiens 


HUMA- Human secreted protein encoded 
from gene 26. 


343 


100 


325 


AAB44336 


Homo 
sapiens 


HUMA- Human secreted protein encoded 
by gene 2 clone HROAM11. 


169 


100 


325 


gi|12045265| 
ref|NP_073076.1| 


Mycoplasma 
genitalium 


ATP synthase F0, subunit B (atpF) 


65 


44 


325 


gi|18447301| 
gb|AAL68225.1| 


Drosophila 
melanogaster 


LD26265p 


65 


31 


326 


gi14278927 


Mus 

musculus 


gliacolin 


1291 


94 


326 


gil0566471 


Mus 

musculus 


Gliacolin 


1291 


94 


326 


gi3747097 


Homo 
sapiens 


C1q-related factor 


976 


70 


327 


gi13506225 


Mus 

musculus 


ST7 protein forml splice variant a 


2996 


99 


327 


gi19353275 


Mus 

musculus 


Similar to suppression of tumorigenicity 7 


2940 


98 


327 


gi9230665 


Homo 
sapiens 


FAM4A1 splice variant a 


2857 


95 


328 


gi9230665 


Homo 
sapiens 


FAM4A1 splice variant a 


2709 


94 


328 


gi13506227 


Mus 

musculus 


ST7 protein forml splice variant b 


2702 


94 


328 


gi13506225 


Mus 

musculus 


ST7 protein forml splice variant a 


2668 


90 


329 


gi9230667 


Homo 
sapiens 


FAM4A1 splice variant b 


2859 


99 


329 


gi13506225 


Mus 

musculus 


ST7 protein forml splice variant a 


2848 


96 


329 


gi19353275 


Mus 

musculus 


Similar to suppression of tumorigenicity 7 


2792 


95 


330 


AAU19222 


Homo 
sapiens 


PHAA Human G protein-coupled receptor 
nGPCR-2343. 


467 


100 


330 


AAV25491 
aal 


Homo 
sapiens 


BGHM cDNA for Epstein Barr virus 
induced gene 2 (EBI-2). 


317 


38 


330 


AAY90630 


Homo 
sapiens 


AREN- Human G protein-coupled receptor 
EBI2. 


317 


38 


331 


AAB94231 


Homo 
sapiens 


HELI- Human protein sequence SEQ ID 
NO: 14604. 


3584 


99 


331 


AAB95784 


Homo 
sapiens 


HELI- Human protein sequence SEQ ID 
NO: 18737. 


3570 


100 


331 


gi10880791 


Homo 
sapiens 


PP791 protein 


3329 


99 


332 


AAY23325 


Homo 
sapiens 


GETH A33 related antigen JAM. 


105 


27 


332 


gi3462455 


Mus 

musculus 


junctional adhesion molecule 


105 


27 


332 


gi8650528 


Rattus 
norvegicus 


junctional adhesion molecule JAM 


98 


26 


333 


AAG93279 


Homo 
sapiens 


NISC- Human protein HP03145. 


1977 


99 


333 


gil4250676 


Homo 
sapiens 


Similar to RIKEN cDNA 2310002F18 gene 


1977 


99 


333 


AAY27589 


Homo 
sapiens 


HUMA- Human secreted protein encoded 
by gene No. 23. 


1578 


100 


334 


gi953239 


Homo 
sapiens 


tetraspan membrane protein 


996 


91 


334 


gi12655071 


Homo 
sapiens 


transmembrane 4 superfamily member 4 


996 


91 


334 


gi11493837 


Rattus 
norvegicus 


tetraspan protein LRTM4 


911 


81 


335 


AAB94238 


Homo 
sapiens 


HELI- Human protein sequence SEQ ID 
NO: 14621. 


3039 


99 


335 


AAB87342 


Homo 
sapiens 


HUMA- Human gene 1 encoded secreted 
protein HETHR73, SEQ ID NO:83. 


3033 


99 


335 


AAU23815 


Homo 
sapiens 


UROG- Human prostate-related gene 
103P2D6 encoded protein. 


3016 


99 


336 


gi14336694 


Homo 
sapiens 


M83 


4100 


99 


336 


gi18204292 


Homo 
sapiens 


transmembrane protein 8 (five membrane- 
spanning domains) 


4096 


99 


336 


gil0716072 


Homo 
sapiens 


M83 protein 


4089 


99 


337 


AAD02700_ 
aal 


Homo 
sapiens 


REGC Human glycosylsulfotransferase- 
4beta (GST-4beta) cDNA. 


2056 


100 


337 


AAE15438 


Homo 
sapiens 


INCY- Human drug metabolising enzyme 
(DME)-5. 


2056 


100 


337 


AAY72640 


Homo 
sapiens 


REGC Human glycosylsulfotransferase- 
4beta (GST-4beta). 


2056 


100 




AAB82971 


Homo 
sapiens 


MILL- G protein coupled receptor 43238. 


1631 


99 


338 


gi18480770 


Mus 

musculus 


olfactory receptor MOR271-1 


1373 


83 


338 


gil 8479336 


Mus 

musculus 


olfactory receptor MOR270-1 


1367 


83 


339 


AAB82971 


Homo 
sapiens 


MILL- G protein coupled receptor 43238. 


1562 


99 


339 


gil8479336 


Mus 

musculus 


olfactory receptor MOR270-1 


1338 


85 


339 


gi18480770 


Mus 

musculus 


olfactory receptor MOR271-1 


1336 


84 


340 


gi7960136 


Homo 
sapiens 


neuroligin 3 isoform 


4557 


100 


340 


gi1145791 


Rattus 
norvegicus 


neuroligin 3 


4505 


98 


340 


gi7960135 


Homo 
sapiens 


neuroligin 3 isoform 


4419 


97 


341 


ABB07253 


Homo 
sapiens 


LEXI- Human novel GPCR (NGPCR) 
protein. 


3943 


99 


341 


AAM69607 


Homo 
sapiens 


MOLE- Human bone marrow expressed 
probe encoded protein SEQ ID NO: 29913. 


1770 


82 


341 


AAM57201 


Homo 
sapiens 


MOLE- Human brain expressed single exon 
probe encoded protein SEQ ID NO: 29306. 


1770 


82 


342 


AAG72315 


Homo 
sapiens 


YEDA Human olfactory receptor 
polypeptide, SEQ ID NO: 1996. 


1140 


76 


342 


AAE18020 


Homo 
sapiens 


CURA- Human G-protein coupled receptor- 
7 (GPCR-7) protein. 


915 


96 


342 


AAU24629 


Homo 
sapiens 


SENO- Human olfactory receptor 
AOLFR123. 


859 


89 


343 


AAB95124 


Homo 
sapiens 


HELI- Human protein sequence SEQ ID 
NO:17122. 


1552 


81 


343 


gi854065 


Human 
herpesvirus 6 


U88 


802 


46 


343 


AAM40934 


Homo 
sapiens 


HYSE- Human polypeptide SEQ ID NO 
5865. 


435 


36 


344 


AAG71823 


Homo 
sapiens 


YEDA Human olfactory receptor 
polypeptide, SEQ ID NO: 1504. 


1627 


100 


344 


AAU24669 


Homo 
sapiens 


SENO- Human olfactory receptor 
AOLFR167. 


1627 


100 


344 


AAE11910 


Homo 
sapiens 


CURA- Human G-protein coupled receptor 
15a (GPCR15a) protein. 


1627 


100 


345 


AAU00437 


Homo 
sapiens 


COUN- Human dendritic cell membrane 
protein FIRE. 


2867 


88 


345 


AAY91625 


Homo 
sapiens 


HUMA- Human secreted protein sequence 
encoded by gene 22 SEQ ID NO:298. 


1966 


97 


345 


gi 1 0930385 


Mus 

musculus 


seven-span membrane protein FIRE 


1838 


55 


346 


AAU00437 


Homo 
sapiens 


COUN- Human dendritic cell membrane 
protein FIRE. 


2341 


87 


346 


AAY91625 


Homo 
sapiens 


HUMA- Human secreted protein sequence 
encoded by gene 22 SEQ ID NO: 298. 


1966 


97 


346 


gil6930385 


Mus 

musculus 


seven-span membrane protein FIRE 


1535 


59 


347 


ABB94047 


Homo 
sapiens 


HUMA- Human secreted protein SEQ ID 
NO: 90. 


54 


31 


347 


ABB94023 


Homo 
sapiens 


HUMA- Human secreted protein SEQ ID 
NO: 66. 


84 


31 


347 


gi|21288752| 
gb|EAA01045.1| 


Anopheles 
gambiae str. 
PEST 


ebiP7790 


537 


34 


348 


AAW75000 


Homo 
sapiens 


HUMA- Human secreted protein encoded 
by gene 146 clone HSNAK17. 


349 


100 


348 


ABB03792 


Homo 
sapiens 


HUMA- Human musculoskeletal system 
related polypeptide SEQ ID NO 1739. 


70 


28 


348 


gi|17542842| 
ref|NP_500310.1| 


Caenorhabditis 
elegans 


W08E12.8.p 


69 


39 


349 


gil 9684 136 


Homo 
sapiens 


Similar to RIKEN cDNA 4933413N12 gene 


178 


26 


349 


gi841378 


Saccharomyces 
cerevisiae 


Gpi2p 


90 


30 


349 


gi295139 


Staphylococcus 
lugdunensis 


ORFB 


79 


31 


350 


AAB88406 


Homo 
sapiens 


HELI- Human membrane or secretory 
protein clone PSEC0162. 


1421 


99 


350 


ABB50346 


Homo 
sapiens 


HUMA- Human secreted protein encoded 
by gene 46 SEQ ID NO:294. 


476 


95 


350 


AAW88579 


Homo 
sapiens 


HUMA- Secreted protein encoded by gene 
46 clone HCFMV39. 


476 


95 


351 


gi292793 


Homo 
sapiens 


T-cell receptor beta 


636 


98 


351 


AAM76093 


Homo 
sapiens 


MOLE- Human bone marrow expressed 
probe encoded protein SEQ ID NO: 36399. 


594 


93 


351 


AAM63281 


Homo 
sapiens 


MOLE- Human brain expressed single exon 
probe encoded protein SEQ ID NO: 35386. 


594 


93 


352 


AAY10839 


Homo 
sapiens 


HUMA- Amino acid sequence of a human 
secreted protein. 


225 


95 


353 


AAY16784 


Homo 
sapiens 


GEMY Human secreted protein (clone 
colOOO 1). 


488 


100 


353 


gil 850866 


Macropus 
robustus 


ATPase subunit 8 


69 


31 


353 


gi2935032 


Rhodococcus 
opacus 


ClcR 


68 


42 


354 


gi|21293186| 
gb|EAA05331.1| 


Anopheles 
gambiae str. 
PEST 


agCP9246 


71 


26 


355 


AAA40083_ 
aal 


Homo 
sapiens 


KAZU- Human brain-specific 
transmembrane glycoprotein encoding 
cDNA. 


1553 


51 


355 


AAB12448 


Homo 
sapiens 


CHUG- Human hh00149 protein SEQ ID 
NO:4. 


1553 


51 


355 


AAB09968 


Homo 
sapiens 


KAZU- Human brain-specific 
transmembrane glycoprotein. 


1553 


51 


356 


AAB50953 


Homo 
sapiens 


GETH Human PRO534 protein. 


1760 


95 


356 


AAB73689 


Homo 
sapiens 


INCY- Human oxidoreductase protein ORP- 


1760 


95 


356 


AAB44303 


Homo 
sapiens 


GETH Human PRO534 (UNQ335) protein 
sequence SEQ ID NO:410. 


1760 


95 


357 


gil2276180 


Homo 
sapiens 


metalloprotease-disintegrin meltrin beta 


5255 


99 


357 


AAE19181 


Homo 
sapiens 


INCY- Human protease, PRTS-18 protein. 


4967 


99 


357 


gil 2802370 


Homo 
sapiens 


disintegrin and metalloproteinase ADAM19 


4967 


99 


358 


gi18056675 


Homo 
sapiens 


FREB 


1969 


98 


358 


gi21245136 


Homo 
sapiens 


FCRLal 


1940 


99 


358 


AAE03451 


Homo 
sapiens 


HUMA- Human gene 25 encoded secreted 
protein HRGBL78, SEQ ID NO: 134. 


1888 


98 


359 


gi18056675 


Homo 
sapiens 


FREB 


1986 


99 


359 


AAE03451 


Homo 
sapiens 


HUMA- Human gene 25 encoded secreted 
protein HRGBL78, SEQ ID NO: 134. 


1905 


99 


359 


AAB34744 


Homo 
sapiens 


ALPH- Human secreted protein encoded by 
DNA clone vq24 1. 


1905 


99 


360 


AAW74807 


Homo 
sapiens 


HUMA- Human secreted protein encoded 
by gene 79 clone HSKNE46. 


270 


100 


360 


AAO02082 


Homo 
sapiens 


HYSE- Human polypeptide SEQ ID NO 
15974. 


69 


41 


360 


AAB34697 


Homo 
sapiens 


ALPH- Human secreted protein encoded by 
DNA clone vq6 1. 


66 


45 


361 


gil7861418 


Drosophila 
melanogaster 


GH03649p 


226 


35 


361 


gi6959684 


Mus 

musculus 


glycolipid transfer protein 


95 


24 


361 


gil6741551 


Mus 

musculus 


Similar to glycolipid transfer protein 


95 


24 


362 


AAE06578 


Homo 
sapiens 


SAGA Human protein having hydrophobic 
domain, HP10769. 


2337 


100 


362 


gil3623231 


Homo 
sapiens 


Similar to RIKEN cDNA 1200013A08 gene 


2337 


100 


362 


AAB92464 


Homo 
sapiens 


HELI- Human protein sequence SEQ ID 
NO: 10520. 


2272 


98 


363 


AAU12211 


Homo 
sapiens 


GETH Human PRO1886 polypeptide 
sequence. 


1639 


99 


363 


gi|17542564| 
ref|NP_501434.1| 


Caenorhabditis 
elegans 


T26A8.2.p 


189 


21 


363 


gi|21298000| 
gb|EAA10145.1| 


Anopheles 
gambiae str. 
PEST 


agCP15426 


127 


18 


364 


ABB05715 


Homo 
sapiens 


GEHU- Human transmembrane protein 
clone tes3 17i21. 


1237 


100 


364 


AAU27674 


Homo 

sapiens 


ZYMO Human protein AFP669232. 


649 


48 


364 


AAB24463 


Homo 
sapiens 


HUMA- Human secreted protein sequence 
encoded by gene 27 SEQ ID NO:88. 


648 


48 


365 


gi14582572 


Homo 
sapiens 


orphan transporter SLC19A3 


2549 


100 


365 


gi12483888 


Homo 
sapiens 


solute carrier 19A3 


2549 


100 


365 


gi12483890 


Mus 

musculus 


solute carrier 19A3 


1713 


68 


366 


AAM41254 


Homo 
sapiens 


HYSE- Human polypeptide SEQ ID NO 
6185. 


632 


90 


366 


ABB11854 


Homo 
sapiens 


HYSE- Human secreted protein homologue, 
SEQ ID NO:2224. 


632 


90 


366 


ABB89257 


Homo 
sapiens 


HUMA- Human polypeptide SEQ ID NO 
1633. 


631 


99 


367 


AAB94138 


Homo 
sapiens 


HELI- Human protein sequence SEQ ID 
NO: 14406. 


2598 


100 


367 


gil5866720 


Homo 
sapiens 


fukutin-related protein 


2598 


100 


367 


gi17945162 


Drosophila 
melanogaster 


RE09574p 


354 


23 


368 


AAE14448 


Homo 
sapiens 


INCY- Human drug metabolising enzyme 
(DME)-11. 


2002 


99 


368 


AAB85780 


Homo 
sapiens 


INCY- Human drug metabolizing enzyme 
(ID No. 7256116CD1). 


1797 


98 


368 


gi4519535 


Homo 
sapiens 


Leukotriene B4 omega-hydroxylase 


1222 


64 


369 


gil8157547 


Mus 

musculus 


pecanex-like 3 


1809 


95 


369 


gi15076843 


Homo 
sapiens 


pecanex-like protein 1 


872 


34 


369 


AAM42412 


Homo 
sapiens 


HUMA- Human polypeptide SEQ ID NO 
145. 


743 


100 


370 


AAB61219 


Homo 
sapiens 


MILL- Human TANGO 292 protein. 


1201 


100 


370 


gi14603178 


Homo 
sapiens 


transmembrane gamma-carboxyglutamic 
acid protein 4 


1201 


100 


370 


gi12656635 


Homo 
sapiens 


transmembrane gamma-carboxyglutamic 
acid protein 4 TMG4 


1201 


100 


371 


AAM40584 


Homo 
sapiens 


HYSE- Human polypeptide SEQ ID NO 
5515. 


2045 




371 


ABB10286 


Homo 

sapiens 


HUMA- Human cDNA SEQ ID NO: 594. 


2045 


95 


371 


ABB10269 


Homo 
sapiens 


HUMA- Human cDNA SEQ ID NO: 577. 


2045 


95 


372 


gil510143 


Homo 
sapiens 


similar to C.elegans protein encoded in 
cosmid T20D3 (Z68220). 


1624 


55 


372 


ABB89128 


Homo 
sapiens 


HUMA- Human polypeptide SEQ ID NO 
1504. 


1359 


98 


372 


AAY53635 


Homo 
sapiens 


CHIR A bone marrow secreted protein 
designated BMS53. 


1148 


51 


373 


AAB93444 


Homo 
sapiens 


HELI- Human protein sequence SEQ ID 
NO: 12686. 


1006 


87 


373 


ABB89562 


Homo 
sapiens 


HUMA- Human polypeptide SEQ ID NO 
1938. 


998 


86 


373 


gil 5209353 


Caenorhabditis 
elegans 


Y39B6A.1 


138 


45 


374 


AAM06271 


Homo 
sapiens 


HYSE- Human foetal protein, SEQ ID NO: 

z. 


426 


98 


374 


gil90203 


Homo 
sapiens 


potassium channel 


76 


32 


374 


gil 01 76968 


Arabidopsis 
thaliana 


receptor-like protein kinase 


76 


31 


375 


gi5542014 


Homo 
sapiens 


dyskerin 


2616 


91 


375 


AAY33675 


Homo 
sapiens 


DEKR- Human DKC1 protein. 


2549 


90 


375 


gi3135028 


Homo 
sapiens 


dyskerin 


2549 


90 


376 


gi5542014 


Homo 
sapiens 


dyskerin 


2492 


94 


376 


AAY33675 


Homo 
sapiens 


DEKR- Human DKC1 protein. 


2425 


92 


376 


gi3135028 


Homo 
sapiens 


dyskerin 


2425 


92 


377 


gil763011 


Homo 
sapiens 


lysophospholipase homolog 


1444 


90 


377 


gil3623261 


Homo 
sapiens 


lysophospholipase-like 


1444 


90 


377 


gi14594904 


Homo 
sapiens 


monoglyceride lipase 


1390 


90 


378 


gil763011 


Homo 
sapiens 


lysophospholipase homolog 


1589 


92 


378 


gil3623261 


Homo 
sapiens 


lysophospholipase-like 


1589 


92 


378 


gi14594904 


Homo 
sapiens 


monoglyceride lipase 


1535 


92 


379 


ABB90165 


Homo 
sapiens 


HUMA- Human polypeptide SEQ ID NO 
2541. 


571 


93 


379 


AAY94946 


Homo 
sapiens 


GEMY Human secreted protein clone 
cd205 2 protein sequence SEQ ID NO:98. 


571 


93 


379 


AAY53051 


Homo 
sapiens 


GEMY Human secreted protein clone 

ddl 19 4 protein sequence SEQ ID NO:108. 


318 


59 


380 


AAM93503 


Homo 
sapiens 


HELI- Human polypeptide, SEQ ID NO: 
3213. 


1082 


92 


380 


AAY77122 


Homo 
sapiens 


INCY- Human neurotransmission-associated 
protein (NTAP) 414692. 


1082 


92 


380 


gi6523817 


Homo 
sapiens 


SIR protein 


1082 


92 


381 


AAE07124 


Homo 
sapiens 


HUMA- Human gene 16 encoded secreted 
protein fragment, SEQ ID NO: 141. 


931 


91 


381 


AAE07099 


Homo 
sapiens 


HUMA- Human secreted protein, SEQ ID 
NO:116. 


931 


91 


381 


gi6980032 


Mus 

musculus 


ARL-6 interacting protein-1 


907 


88 


382 


gi21430284 


Drosophila 
melanogaster 


LD38689p 


1292 


40 


382 


AAM80289 


Homo 
sapiens 


HYSE- Human protein SEQ ID NO 3935. 


191 


30 


382 


AAM79305 


Homo 
sapiens 


HYSE- Human protein SEQ ID NO 1967. 


191 


30 


383 


AAG73684 


Homo 
sapiens 


HUMA- Human colon cancer antigen 
protein SEQ ID NO:4448. 


1863 


98 


383 


AAY48312 


Homo 
sapiens 


META- Human prostate cancer-associated 
protein 9. 


1509 


100 


383 


gi17389322 


Homo 
sapiens 


Similar to NICE-5 protein 


1419 


74 


384 


AAB93185 


Homo 
sapiens 


HELI- Human protein sequence SEQ ID 
NO: 12134. 


2492 


100 


384 


AAM93581 


Homo 
sapiens 


HELI- Human polypeptide, SEQ ID NO: 
3373. 


1971 


96 


384 


AAE10328 


Homo 
sapiens 


INCY- Human transporter and ion channel-5 
(TRICH-5) protein. 


1873 


100 


385 


ABB89951 


Homo 
sapiens 


HUMA- Human polypeptide SEQ ID NO 
2327. 


2862 


99 


385 


AAB58984 


Homo 
sapiens 


HUMA- Breast and ovarian cancer 
associated antigen protein sequence SEQ ID 
692. 


759 


94 


385 


ABB04610 


Homo 
sapiens 


BODA- Human quinoprotein dehydrogenase
33 protein SEQ ID NO:2. 


244 


27 


386 


ABB89951 


Homo 
sapiens 


HUMA- Human polypeptide SEQ ID NO
2327.


2791 


98 


386 


AAB58984 


Homo 
sapiens 


HUMA- Breast and ovarian cancer 
associated antigen protein sequence SEQ ID 
692. 


688 


89 


386 


ABB04610 


Homo 
sapiens 


BODA- Human quinoprotein dehydrogenase
33 protein SEQ ID NO:2. 


251 


28 


387 


AAM93354 


Homo 
sapiens 


HELI- Human polypeptide, SEQ ID NO: 
2907. 


531 


100 


387 


AAM00917 


Homo 
sapiens 


HYSE- Human bone marrow protein, SEQ 
ID NO: 393. 


495 


99 


387 


gil8308220 


Xenopus 
laevis 


transmembrane protein quicken 


333 


77 


388 


AAU12232 


Homo 
sapiens 


GETH Human PRO4398 polypeptide
sequence. 


2696 


100 


388 


ABB90111 


Homo 
sapiens 


HUMA- Human polypeptide SEQ ID NO 
2487. 


1784 


99 


388 


gi 14860862 


Homo 
sapiens 


polyamine oxidase isoform-1 


932 


39 


389 


AAM00947 


Homo 
sapiens 


HYSE- Human bone marrow protein, SEQ 
ID NO: 423. 


6659 


98 


389 


AAM00834 


Homo 
sapiens 


HYSE- Human bone marrow protein, SEQ 
ID NO: 197. 


4723 


100 


389 


AAY99666 


Homo 
sapiens 


INCY- Human GTPase associated protein- 
17. 


3647 


97 


390 


AAE17492 


Homo 
sapiens 


INCY- Human secretion and trafficking 
protein- 1 (SAT-1). 


1705 


100 


390 


gi13529623


Mus 

musculus 


Similar to RIKEN cDNA 4930418P06 gene


1408 


81 


390 


gi|21313292|
ref|NP_084053.1|


Mus 

musculus 


RIKEN cDNA 4930418P06


1401 


80 


391 


AAB36613 


Homo 
sapiens 


INCY- Human FLEXHT-35 protein 
sequence SEQ ID NO:35. 


1121 


85 


391 


gi14603247


Homo 
sapiens 


Similar to RIKEN cDNA 5730409G15 gene 


1121 


85 


391 


AAB93042 


Homo 
sapiens 


HELI- Human protein sequence SEQ ID 
NO: 11827. 


240 


90 


392 


AAB82940 


Homo 
sapiens 


UYNY Human androgen receptor trapped 
protein 5 (ART5). 


299 


39 


392 


AAB56085 


Homo 
sapiens 


HUMA- Human secreted protein sequence 
encoded by gene 9 SEQ ID NO: 179. 


299 


39 


392 


gi18043859


Mus 

musculus 


Similar to RIKEN cDNA 9430098E02 gene 


251 


42 


393 


AAM39990 


Homo 
sapiens 


HYSE- Human polypeptide SEQ ID NO 
3135. 


1209 


70 


393 


AAM38999 


Homo
sapiens


HYSE- Human polypeptide SEQ ID NO
2144.


1209


70


393 


AAB18993


Homo 
sapiens 


INCY- Amino acid sequence of a human 
transmembrane protein. 


1209 


70 


394 


gi4220892 


Homo 
sapiens 


transcriptional co-activator CRSP34 


919 


97 


394 


gi7141322 


Homo 
sapiens 


p37 TRAP/SMCC/PC2 subunit


918 


97 


394 


gi!6741439 


Mus 

musculus 


RIKEN cDNA 1500015J03 gene


918 


97 


395 


gi1825729


Caenorhabditis
elegans


C. elegans PTR-2 protein (corresponding 
sequence C32E8.8) 


1024 


30 


395 


gi3880799 


Caenorhabditis
elegans


Y39A1B.2 


940 


29 


395 


gi15718594


Caenorhabditis
elegans


C. elegans PTR-10 protein (corresponding 
sequence F55F8.1) 


818 


28 


396 


AAB20342 


Homo 
sapiens 


UYMC- Peroxisome proliferator-activated 
receptor alpha. 


2265 


94 


396 


AAR74053 


Homo 
sapiens 


LIGA- Human peroxisome proliferator 
activated receptor. 


2265 


94 


396 


gi765240 


Homo 
sapiens 


peroxisome proliferator activated receptor 
alpha; PPAR alpha 


2265 


94 


397 


ABB11934


Homo 
sapiens 


HYSE- Human transmembrane protein 
homologue, SEQ ID NO:2304. 


1692 


100 


397 


AAB43983 


Homo 
sapiens 


HUMA- Human cancer associated protein 
sequence SEQ ID NO: 1428. 


1692 


100 


397 


AAH47123_aa1


Homo 
sapiens 


NIGE- Human B 1 466 protein encoding 
cDNA. 


1409 


100 


398 


gil 9526687 


Mus 

musculus 


Na-H exchanger isoform NHE8 


2829 


96 


398 


gi5304871 


Homo 
sapiens 


dJ963K23.4 (continues in dJ1041C10 
(AL162615)) 


2236 


100 


398 


gil7862784 


Drosophila
melanogaster


LP02993p 


1535 


55 


399 


AAB93258 


Homo 
sapiens 


HELI- Human protein sequence SEQ ID 
NO: 12282. 


1617 


99 


399 


AAY28810 


Homo 
sapiens 


GEMY nn296_2 secreted protein. 


1617


99 


399 


ABB89196 


Homo 
sapiens 


HUMA- Human polypeptide SEQ ID NO 
1572. 


1319 


99 


400 


AAG00388 


Homo 
sapiens 


GEST Human secreted protein, SEQ ID NO: 
4469. 


316 


100 


401 


AAU21958 


Homo 
sapiens 


HUMA- Human cardiovascular system 
antigen polypeptide SEQ ID No 732. 


97 


26 


401 


gil814196 


Caenorhabditis
elegans


AO 13 ankyrin 


87 


31 


401 


gil91 10782 


Homo 
sapiens 


DNA helicase HEL308 


81 


25 


402 


gi21438549


Homo 
sapiens 


humane cDNA 


2566 


99 


402 


gi21438547


Rattus 
norvegicus 


Ratten cDNA 


2444 


93 


402 


gi21438551


Mus 

musculus 


genomische DNA Exon I der Maus 


691 


91 


403 


AAE04759 


Homo
sapiens


INCY- Human vesicle trafficking protein-2
(VETRP-2) protein.


1013


100


403 


AAB98207 


Homo 
sapiens 


SHAN- Human P24 protein-22 SEQ ID 
NO:2. 


1009 


99 


403 


gil61 18876 


Homo 
sapiens 


vesicular membrane protein P24 


1009 


99 


404 


ABB14761


Homo 
sapiens 


HUMA- Human nervous system related 
polypeptide SEQ ID NO 3418. 


873 


95 


404 


AAU25439 


Homo 
sapiens 


INCY - Human mddt protein from clone 
LG:403872.1:2000MAY19.


524 


38 


404 


AAU75787 


Homo 
sapiens 


INCY- Human protein phosphatase 5 (PP5) 
protein sequence. 


444 


36 


405 


AAM93259 


Homo 
sapiens 


HELI- Human polypeptide, SEQ ID NO: 
2709. 


1257 


100 


405 


gil 6877659 


Homo 
sapiens 


Similar to RIKEN cDNA 1810054O13 gene


1157 


98 


405 


AAG81420 


Homo 
sapiens 


ZYMO Human AFP protein sequence SEQ 
ID NO:358.


137 


40 


406 


gil 22 14288 


Homo 
sapiens 


dJ402H5.2 (novel protein similar to worm 
and fly proteins) 


1397 


50 


406 


gi3880799 


Caenorhabditis
elegans


Y39A1B.2 


707 


25 


406 


gi1825729


Caenorhabditis
elegans


C. elegans PTR-2 protein (corresponding 
sequence C32E8.8) 


602 


24 


407 


gi19338984


Homo 
sapiens 


fat cell-specific low molecular weight 
protein beta 


135 


44 


407 


gil9071802 


Homo 
sapiens 


fat cell-specific low molecular weight 
protein alpha 


135 


44 


407 


gi20380358 


Mus 

musculus 


RIKEN cDNA 1110025G12 gene


121 


31 


408 


ABB90225 


Homo 
sapiens 


HUMA- Human polypeptide SEQ ID NO 
2601. 


952 


100 


408 


AAB12150 


Homo 
sapiens 


PROT- Hydrophobic domain protein 
isolated from HT-1080 cells. 


952 


100 


408 


ABB06157 


Homo 
sapiens 


COMP- Human NS protein sequence SEQ 
ID NO:249.


944 


98 


409 


gil5074997 


Sinorhizobium
meliloti


CONSERVED HYPOTHETICAL 
PROTEIN 


96 


32 


409 


gi|20868002|
ref|XP_137398.1|


Mus 

musculus 


similar to expressed sequence AW049604 


75 


28 


410 


AAY57279 


Homo 
sapiens 


YEDA Transcription factor subunit 
TAFII105 polypeptide. 


3902 


98 


410 


AAW31494 


Homo 
sapiens 


REGC Human hTAFII105 protein.


3902 


98 


410 


gil 669689 


Homo 
sapiens 


TBP associated factor 


3902 


98 


411 


AAE04639 


Homo 
sapiens 


MILL- Human novel transmembrane 
protein, 32164 protein. 


1588 


98 


411 


AAE18658


Homo 
sapiens 


INCY- Human G-protein coupled receptor 
(GCREC-19). 


1548 


98 


411 


AAG71672 


Homo 
sapiens 


YEDA Human olfactory receptor 
polypeptide, SEQ ID NO: 1353. 


1202 


94 


412 


ABB11920


Homo 
sapiens 


HYSE- Human adrenomedullin receptor 
homologue, SEQ ID NO:2290.


1795 


95 


412 


AAY16630 


Homo
sapiens


SMIK Human Putative Adrenomedullin
Receptor (PAR).


1789


94


412 


gi292419 


Homo 
sapiens 


orphan receptor 


1774 


93 


413 


AAY95002 


Homo 
sapiens 


ALPH- Human secreted protein vc34 1, 
SEQ ID NO:44. 


1027 


56 


413 


ABB12222


Homo 
sapiens 


HYSE- Human secreted protein homologue, 
SEQ ID NO:2592. 


697 


76 


413 


AAM95374 


Homo 
sapiens 


HUMA- Human reproductive system related 
antigen SEQ ID NO: 4032. 


477 


65 


414 


ABB89474 


Homo 
sapiens 


HUMA- Human polypeptide SEQ ID NO 
1850. 


1004 


98 


414 


AAB56877 


Homo 
sapiens 


ROSE/ Human prostate cancer antigen 
protein sequence SEQ ID NO: 1455. 


1004 


98 


414 


gi18044902


Mus 

musculus 


Similar to RIKEN cDNA 3110005G23 gene


851 


65 


415 


gi!79165 


Homo 
sapiens 


Na,K-ATPase subunit alpha 2


5238 


99 


415 


gi203029 


Rattus 
norvegicus 


(Na+ and K+) ATPase, alpha+ catalytic
subunit precursor 


5205 


98 


415 


gi212406


Gallus 
gallus 


Na,K-ATPase alpha-2-subunit 


4977 


93 


416 


gil8606367 


Mus 

musculus 


RIKEN cDNA 4930570C03 gene 


715 


92 


416 


AAB90649 


Homo 
sapiens 


HUMA- Human secreted protein, SEQ ID 
NO: 192. 


562 


97 


416 


AAB90565 


Homo 
sapiens 


HUMA- Human secreted protein, SEQ ID 
NO: 103. 


472 


100 


417 


gil3512192 


Homo 
sapiens 


polycystic kidney and hepatic disease 1 


1871 


100 


417 


gi178273


Homo 
sapiens 


alanine:glyoxylate aminotransferase 


77 


26 


417 


gi28561 


Homo 
sapiens 


L-alanine:glyoxylate aminotransferase


77 


26


418 


gi13249295


Homo 
sapiens 


anion exchanger AE4 


4951 


100 


418 


gi7363254 


Homo 
sapiens 


sodium bicarbonate cotransporter 5 


4898 


98 


418 


gi13517508


Homo 
sapiens 


sodium bicarbonate cotransporter 


4873 


95 


419 


gi2564913 


Homo 
sapiens 


metaxin 


1108 


82 


419 


gi12804907


Homo 
sapiens 


Similar to metaxin 1 


1100 


99 


419


gi807670


Mus 

musculus 


metaxin 


995 


89 


420




Homo 
sapiens 


metaxin 


100 J 


100


420 


gil 8606009 


Mus 

musculus 


metaxin 


1528 


91 


420 


gi12804907


Homo 
sapiens 


Similar to metaxin 1 


1470 


90 


421 


gi6094684 


Homo 
sapiens 


similar to Kelch proteins; similar to 
BAA77027 (PID:g4650844) 


694 


31 


421 


AAB93480 


Homo 
sapiens 


HELI- Human protein sequence SEQ ID 
NO: 12768. 


630 


29


421


AAU28187 


Homo 
sapiens 


HYSE- Novel human secretory protein, Seq 
ID No 356. 


628 


29 


422 


gi14715068


Homo 
sapiens 


Similar to RIKEN cDNA 2600001A11 gene


2062 


100 


422 


gi4808241 


Homo 
sapiens 


dJ466N1.2 (glycine C-acetyltransferase (2- 
amino-3-ketobutyrate coenzyme A ligase)) 


853 


89 


422 


gi3342906 


Homo 
sapiens 


2-amino-3-ketobutyrate-CoA ligase


853 


89 


423 


AAB65162 


Homo 
sapiens 


GETH Human PRO290 (UNQ253) protein 
sequence SEQ ID NO:33. 


1972 


100 


423 


AAY66639 


Homo 
sapiens 


GETH Membrane-bound protein PRO290. 


1972 


100 


423 


AAB24058 


Homo 
sapiens 


GETH Human PRO290 protein sequence 
SEQ ID NO:7. 


1972 


100 


424 


gil67835 


Dictyostelium
discoideum


myosin heavy chain 


142 


24 


424 


gi2983243 


Aquifex 
aeolicus 


chromosome assembly protein homolog 


140 


20 


424 


AAB95546 


Homo 
sapiens 


HELI- Human protein sequence SEQ ID 
NO:18167. 


132 


25 


425 


AAB43587 


Homo 
sapiens 


HUMA- Human cancer associated protein 
sequence SEQ ID NO: 1032. 


427 


100 


425 


AAM52659 


Homo 
sapiens 


BIOW- Human phosphatase 9. 


423 


98 


425 


AAG00658 


Homo 
sapiens 


GEST Human secreted protein, SEQ ID NO: 
4739. 


360 


97 


426 


gil3325388 


Homo 
sapiens 


Similar to RIKEN cDNA 1110007C09 gene


821 


88 


426 


ABB89804 


Homo 
sapiens 


HUMA- Human polypeptide SEQ ID NO 
2180. 


814 


87 


426 


AAG73935 


Homo 
sapiens 


HUMA- Human colon cancer antigen 
protein SEQ ID NO:4699. 


299 


95 


427 


AAB93249 


Homo 
sapiens 


HELI- Human protein sequence SEQ ID 
NO: 12263. 


731 


49 


427 


AAB18977


Homo 
sapiens 


INCY- Amino acid sequence of a human 
transmembrane protein. 


615 


89 


427 


AAE01518 


Homo 
sapiens 


HUMA- Human gene 2 encoded secreted 
protein fragment, SEQ ID NO: 175. 


495 


98 


428 


AAB18977


Homo 
sapiens 


INCY- Amino acid sequence of a human 
transmembrane protein. 


1008 


100 


428 


AAB93249 


Homo 
sapiens 


HELI- Human protein sequence SEQ ID 
NO: 12263. 


756 


43 


428 


AAY00276 


Homo 
sapiens 


HUMA- Human secreted protein encoded 
by gene 19. 


603 


100 




gl/044J lo 


Mesocricetus
auratus


casein kinase I epsilon; CKI epsilon 


I j04 


yy 


430 


gi13122442


Rattus 
norvegicus 


casein kinase 1 epsilon-2 


1564 


99 


430 


gi9650968 


Rattus 
norvegicus 


casein kinase 1 epsilon-3 


1564 


99 


431 


gi2642187 


Rattus 
norvegicus 


endo-alpha-D-mannosidase 


1973 


87 


431 


AAB95204 


Homo 
sapiens 


HELI- Human protein sequence SEQ ID 
NO: 17303. 


1559 


99


431


AAE04255 


Homo 
sapiens 


HUMA- Human gene 4 encoded secreted 
protein fragment, SEQ ID NO: 116.


1408 


98 


432 


ABB05662 


Homo 
sapiens 


GEHU- Human signal transduction protein 
clone amy2 10hl7. 


139 


36 


432 


AAU16313 


Homo 
sapiens 


HUMA- Human novel secreted protein, Seq 
ID 1266. 


139 


36 


432 


gi21040537


Homo 
sapiens 


Similar to RIKEN cDNA 9130020G10 gene 


132 


35 


433 


AAG89209 


Homo 
sapiens 


GEST Human secreted protein, SEQ ID NO: 
329. 


460 


97 


433 


gil890812 


Flexamia 
graminea 


NADH dehydrogenase 1 


71 


24 


433 


gi|21295981|
gb|EAA08126.1|


Anopheles 
gambiae str. 
PEST 


agCP1281 


73 


28 


434 


AAY91533 


Homo 
sapiens 


HUMA- Human secreted protein sequence 
encoded by gene 83 SEQ ID NO:206. 


1159 


100 


434 


gi2150013 


Homo 
sapiens 


transmembrane protein 


1159 


100 


434 


gil2803197 


Homo 
sapiens 


claudin 5 (transmembrane protein deleted in 
velocardiofacial syndrome)


1159 


100 


435 


AAE06609 


Homo 
sapiens 


SAGA Human protein having hydrophobic 
domain, HP10800.


498 


42 


435 


ABB89766 


Homo 
sapiens 


HUMA- Human polypeptide SEQ ID NO 
2142. 


497 


42 


435 


AAB93645 


Homo 
sapiens 


HELI- Human protein sequence SEQ ID 
NO:13146. 


497 


42 


436 


gill 640570 


Homo 
sapiens 


MSTP031 


777 


100 


436 


ABB50826 


Homo 
sapiens 


HUMA- Human secreted protein encoded 
by gene 77 SEQ ID NO:779. 


75 


40 


436 


gil5291231 


Drosophila
melanogaster


GH13214p 


72 


25 


437 


AAG73464 


Homo 
sapiens 


HUMA- Human gene 7-encoded secreted 
protein fragment, SEQ ID NO:239. 


2264 


98 


437 


AAG73462 


Homo 
sapiens 


HUMA- Human gene 7-encoded secreted 
protein fragment, SEQ ID NO:237. 


1897 


100 


437 


AAG73463 


Homo 
sapiens 


HUMA- Human gene 7-encoded secreted 
protein fragment, SEQ ID NO:238. 


1878 


98 


438 


gi9886738 


Homo 
sapiens 


junctophilin type3 


3916 


99 


438 


gi9927307 


Mus 

musculus 


junctophilin type 3 


3551 


90 


438 


gi9886757 


Homo 
sapiens 


junctophilin type3 


3172 


100 


439 


ABB89241 


Homo 
sapiens 


HUMA- Human polypeptide SEQ ID NO 
1617. 


739 


96 


439 


gi 18762530 


Danio rerio 


envelope protein 


380 


47 


439 


AAB08894 


Homo 
sapiens 


HUMA- Human secreted protein sequence 
encoded by gene 4 SEQ ID NO:51.


240 


64 


440 


AAB43484 


Homo 
sapiens 


HUMA- Human cancer associated protein 
sequence SEQ ID NO:929. 


761 


100 


440 


gi 10834676 


Homo 
sapiens 


PP3856 


673 


99


440


gi21428806 


Drosophila
melanogaster


GH04243p 


636 


49 


441 


AAB43484 


Homo 
sapiens 


HUMA- Human cancer associated protein
sequence SEQ ID NO:929. 


761 


100 


441 


gi21428806


Drosophila
melanogaster


GH04243p 


636 


49 


441 


gi!4247685 


Staphylococcus aureus
subsp. aureus
Mu50


nicotinate phosphoribosyltransferase
homolog 


544 


34 


442 


AAB43484 


Homo 
sapiens 


HUMA- Human cancer associated protein
sequence SEQ ID NO:929. 


761 


100 


442 


gi21428806


Drosophila
melanogaster


GH04243p 


636 


49 


442 


gi10834676


Homo 
sapiens 


PP3856 


582 


89 


443 


ABB11177 


Homo 
sapiens 


HYSE- Human phosphatidate
phosphohydrolase homologue, SEQ ID 
NO: 1547. 


952 


98 


443 


AAG89279 


Homo 
sapiens 


GEST Human secreted protein, SEQ ID NO: 
399. 


641 


66 


443 


AAB70690 


Homo 
sapiens 


SREN- Human hDPP protein sequence SEQ 
ID NO:7. 


639 


65 


444 


AAM40391 


Homo 
sapiens 


HYSE- Human polypeptide SEQ ID NO 
3536. 


672 


48


444 


AAM42177 


Homo 
sapiens 


HYSE- Human polypeptide SEQ ID NO 
7108. 


567 


49 


444 


ABB90382 


Homo 
sapiens 


HUMA- Human polypeptide SEQ ID NO
2758. 


559 


42 


445 


gi!9354040 


Mus 

musculus 


Similar to RIKEN cDNA 1810038N08 gene 


853 


95 


445 


gi1403547


Saccharomyces
cerevisiae


P2558 protein 


175 


26 












445 


AAE15269 


Homo 
sapiens 


INCY- Human RNA metabolism protein-32 
(RMEP-32). 


78 


28 


446 


gil5157363 


Agrobacterium
tumefaciens
str. C58
(Cereon)


AGR_C_4025p 


256 


31 












446


gi15075308


Sinorhizobium
meliloti


CONSERVED HYPOTHETICAL 
PROTEIN 


243 


31 


446 


gi21324924 


Corynebacterium
glutamicum
ATCC 13032


Uncharacterized ACR 


192 


28 


447 


gi20069113 


Homo 
sapiens 


corneal endothelium specific protein 1 


1201 


100 


447 


gi12584947


Homo
sapiens


ovary-specific acidic protein


1195


100


447 


gi15214757


Mus 

musculus 


Similar to RIKEN cDNA 4930583H14 gene 


558 


50 


448 


AAT92305_aa1


Homo 
sapiens 


SALIC Constitutively active receptor-alpha 
encoding cDNA. 


1686 


94 


448 


AAG63170 


Homo 
sapiens 


TULA- Amino acid sequence of human 
CAR-a polypeptide. 


1686 


94 


448 


AAW93902 


Homo 
sapiens 


GEHO Human CAR receptor protein. 


1686 


94 


449 


gil8182375 


Bos taurus


photoreceptor cadherin 


2693 


86 


449 


gil4625447 


Rattus 
norvegicus 


MT-protocadherin 


2563 


83 


449 


gil8182377 


Mus 

musculus 


photoreceptor cadherin 


2561 


83 


450 


AAM39421 


Homo 
sapiens 


HYSE- Human polypeptide SEQ ID NO 
2566. 


126 


27 


450 


gi18676458


Homo
sapiens


FLJ00126 protein


126 


27 


450 


gil7861384 


Homo 
sapiens 


nesprin-2 gamma 


126 


27 


451 


gi11967375


Rattus 
norvegicus 


Dvl-binding protein Idax 


1062 


100 


451 


gi11967377


Homo 
sapiens 


Dvl-binding protein IDAX 


1062 


100 


451 


ABB16307 


Homo 
sapiens 


HUMA- Human nervous system related 
polypeptide SEQ ID NO 4964. 


1006 


100 


452 


gi20073201 


Homo 
sapiens 


Similar to Olg-1 bHLH protein 


1301 


100 


452 


gi4929538 


Rattus 
norvegicus 


Olg-1 bHLH protein 


1086 


87 


452 


gi7385152 


Mus 

musculus 


oligodendrocyte-specific bHLH 
transcription factor Oligl 


1069 


86 


453 


AAM68085 


Homo 
sapiens 


MOLE- Human bone marrow expressed 
probe encoded protein SEQ ID NO: 28391. 


6900 


99 


453 


AAM55707 


Homo 
sapiens 


MOLE- Human brain expressed single exon 
probe encoded protein SEQ ID NO: 27812. 


6900 


99 


453 


gil 81 46660 


Homo 
sapiens 


DPCR1 


1206 


100 


454 


AAG75611 


Homo 
sapiens 


HUMA- Human colon cancer antigen 
protein SEQ ID NO:6375. 


1759 


89 


454 


AAY13942


Homo 
sapiens 


SAGA Human transmembrane protein, 
HP01737. 


1759 


89 


454 


gil 5559308 


Homo 
sapiens 


Similar to serologically defined breast 
cancer antigen 84 


1759 


89 


455 


gil 5430296 


Mus 

musculus 


heart alpha-kinase 


100 


24 


455 


gi602255 


Rattus 
norvegicus 


protein tyrosine phosphatase 2E 


99 


22 


455 


gi2425111 


Dictyostelium
discoideum


ZipA 


94 


20 


456 


AAB58236 


Homo 
sapiens 


ROSE/ Lung cancer associated polypeptide 
sequence SEQ ID 574. 


283 


88 


457 


gi5420183 


Homo 
sapiens 


dJ377H14.9 (major histocompatibility 
complex, class I, F (CDA12)) 


611 


96


457


AAG64617 


Homo 
sapiens 


KJMU/ Human cancer cell specific HLA-F 
antigen SEQ ID 4. 


603 


95 


457 


ABB50296 


Homo 
sapiens 


USSH HLA-Cw ovarian tumour marker 
protein, SEQ ID NO:82. 


603 


95 


458 


AAE18015 


Homo 
sapiens 


CURA- Human G-protein coupled receptor- 
3 (GPCR-3) protein. 


1116 


97 


458 


AAU24535 


Homo 
sapiens 


SENO- Human olfactory receptor 
AOLFR20. 


1116 


97 


458 


AAG71945 


Homo 
sapiens 


YEDA Human olfactory receptor 
polypeptide, SEQ ID NO: 1626. 


1106 


96 


459 


AAE02638 


Homo 
sapiens 


SCHE Human dendritic cell specific 
transmembrane protein (DC-STAMP). 


2448 


100 


459 


gil 1612079 


Homo 
sapiens 


DC-specific transmembrane protein 


2448 


100 


459 


AAB87357 


Homo 
sapiens 


HUMA- Human gene 16 encoded secreted
protein HMADJ14, SEQ ID NO:98. 


1798 


99 


460 


ABB89120 


Homo 
sapiens 


HUMA- Human polypeptide SEQ ID NO 
1496. 


403 


87 


460 


gi!7742567 


dipeptide 


ABC transporter, membrane spanning 
protein [Agrobacterium tumefaciens str. 
C58(U. 


71 


29 


460 


gil5159154 


Agrobacterium
tumefaciens
str. C58
(Cereon)


AGR_LJ477p 


71 


29 


461 


AAG73470 


Homo 
sapiens 


HUMA- Human gene 14-encoded secreted 
protein fragment, SEQ ID NO:245. 


699 


100 


461 


ABB90038 


Homo 
sapiens 


HUMA- Human polypeptide SEQ ID NO 
2414. 


486 


53 


461 


AAB95779 


Homo 
sapiens 


HELI- Human protein sequence SEQ ID 
NO: 18726. 


486 


53 


462 


gi7021367 


Drosophila
melanogaster


c11.1


511 


25 


462 


gil 7862452 


Drosophila
melanogaster


LD28902p 


511 


25 


462 


gil2724134 


Lactococcus 
lactis subsp. 
lactis 


HYPOTHETICAL PROTEIN 


81 


33 


463 


AAM42407 


Homo 
sapiens 


HUMA- Human polypeptide SEQ ID NO 
140. 


606 


100 


463 


AAM95921 


Homo 
sapiens 


HUMA- Human reproductive system related 
antigen SEQ ID NO: 4579. 


606 


100 


463 


gi7322066 


Drosophila 
sp. 


His 


335 


27 


464 


gi18147612 


Homo 
sapiens 


metalloprotease disintegrin


4206 


100 


464 


AAB47106 


Homo 
sapiens 


ZYMO Second splice variant of MAPP. 


4190 


99 


464 


gil3157560 


Homo 
sapiens 


dJ964F7.1 (novel disintegrin and reprolysin
metalloproteinase family protein) 


4104 


100 


465 


gi!4091952 


Rattus 
norvegicus 


KIDINS220


294 


26 





465 


gil 1321435 


Rattus
norvegicus 


ankyrin repeat-rich membrane-spanning 
protein 


292 


26 


465 


AAM39025 


Homo 
sapiens 


HYSE- Human polypeptide SEQ ID NO 
2170. 


288 


27 


466 


gi16648368


Drosophila
melanogaster


LD35341p 


177 


49 


466 


gi!9744967 


Dictyostelium
discoideum


80 kda MCM3-associated protein 


153 


22 


466 


gi4995703 


Mus 

musculus 


GANP protein 


141 


25 


467 


gil 2002028 


Homo 
sapiens 


brain my040 protein 


482 


100 


467 


gi|20453865|
gb|AAM22167.1|AF482520_1


Utricularia 
geminiscapa 


cytochrome C oxidase subunit I 


67 


48 


467 


gi|20453861|
gb|AAM22165.1|AF482518_1


Utricularia 
adpressa 


cytochrome C oxidase subunit 1 


67 


48 


468 


AAY94938 


Homo 
sapiens 


GEMY Human secreted protein clone 
ye78 1 protein sequence SEQ ID NO:82. 


2288 


97 


468 


AAG81379 


Homo 
sapiens 


ZYMO Human AFP protein sequence SEQ 
ID NO:276.


1701 


99 


468 


AAG81387 


Homo 
sapiens 


ZYMO Human AFP protein sequence SEQ 
ID NO:292.


1570 


99 


469 


AAY27721 


Homo 
sapiens 


HUMA- Human secreted protein encoded
by gene No. 29. 


1114 


98 


469 


AAB87068 


Homo 
sapiens 


MILL- Human secreted protein TANGO 
365, SEQ ID NO:46.


621 


99 


469


AAB87148


Homo 
sapiens 


MILL- Human secreted protein TANGO 
365 T20S variant, SEQ ID NO: 165. 


617 


98 


470 


gi12140288


Homo
sapiens


bA12M19.1.3 (novel protein)


2537 


100 


470 


gi12140289


Homo
sapiens


bA12M19.1.1 (novel protein)


2203 


88 


470 


AAE03639 


Homo 
sapiens 


INCY- Human extracellular matrix and cell 
adhesion molecule-3 (XMAD-3). 


2114 


88 


471 


AAR90766 


Homo 
sapiens 


USSH Tumour suppressor protein HTS-1.


1502 


70 


471 


gi257387 


Homo 
sapiens 


HTS1 


1502 


70 


471 


gi1769472


Homo
sapiens


p82 


1502 


70 


472 


gil9684136 


Homo 
sapiens 


Similar to RIKEN cDNA 4933413N12 gene


645 


100 


472 


gi559500 


Caenorhabditis
elegans


ND2 protein (AA 1-282)


75 


35 


472 


gi6687124 


Convolvulus 
arvensis 


NADH dehydrogenase subunit F 


72 


30 


473 


gil9684136 


Homo 
sapiens 


Similar to RIKEN cDNA 4933413N12 gene 


972 


100 


473 


gi2258350 


Reclinomonas
americana


SecY-type transporter protein


78


24


473 


gi559500 


Caenorhabditis
elegans


ND2 protein (AA 1-282)


76 


29 


474 


gi32474 


Homo 
sapiens 


h-Sp1


1250 


93 


474 


gi632790 


Homo 
sapiens 


pantophysin 


1250 


93 


474 


gil6877127 


Homo 
sapiens 


Similar to synaptophysin-like protein 


1161 


92 


475 


AAB36613 


Homo 
sapiens 


INCY- Human FLEXHT-35 protein 
sequence SEQ ID NO:35. 


1304 


88 


475 


gi14603247


Homo 
sapiens 


Similar to RIKEN cDNA 5730409G15 gene


1304 


88 


475 


AAB93042 


Homo 
sapiens 


HELI- Human protein sequence SEQ ID 
NO: 11827. 


240 


90 


476 


gi5052674 


Drosophila
melanogaster


BcDNA.LD29892 


349 


24 


476 


gi16768704


Drosophila
melanogaster


HL04910p 


329 


24 


476 


gil 7945748 


Drosophila
melanogaster


RE32936p 


277 


22 


477 


AAG71509 


Homo 
sapiens 


YEDA Human olfactory receptor 
polypeptide, SEQ ID NO: 1190.


1510 


96 


477 


gi2792016 


Homo 
sapiens 


olfactory receptor 


1388 


99 


477 


gi4092819 


Homo 
sapiens 


BC319430 5 


1381 


99 


478 


AAY73483 


Homo 
sapiens 


GEMY Human secreted protein clone 
yll 8 1 protein sequence SEQ ID NO: 1 88. 


579 


47 


478 


AAM92890 


Homo 
sapiens 


HUMA- Human digestive system antigen 
SEQ ID NO: 2239. 


384 


52 


478 


AAU83621 


Homo 
sapiens 


GETH Human PRO protein, Seq ID No 60. 


333 


28 


479 


AAM93439 


Homo 
sapiens 


HELI- Human polypeptide, SEQ ID NO: 
3078. 


1182 


94 


479 


gil 5079907 


Homo 
sapiens 


Similar to secretory carrier membrane 
protein 4 


1182 


94 


479 


ABB06156 


Homo 
sapiens 


COMP- Human NS protein sequence SEQ 
ID NO:248. 


1020 


83 


480 


gi 1497861 


fowl 

adenovirus 
8] [Fowl 
adenovirus 8 


fiber 


81 


24 


480 


gi6572647 


fowl 

adenovirus 8 


short fiber homolog [Fowl 


81 


24 


480 


gi3808227 


Sphaeropsis 
sapinea 
RNA virus 2 


coat protein 


79 


32 


481 


gi!3517508 


Homo 
sapiens 


sodium bicarbonate cotransporter 


5138 


100 


481 


gi14582760


Homo 
sapiens 


anion exchanger AE4 


4979 


97


481


gi7363254 


Homo 
sapiens 


sodium bicarbonate cotransporter 5 


4973 


97 


482 


AAM50714 


Homo 
sapiens 


MILL- Human TRP-like calcium channel-4 
(TLCC-4). 


2810 


99 


482 


gi21435923


Homo 
sapiens 


cation channel TRPV3 


2810 


99 


482 


gi20908451 


Mus 

musculus 


TRP ion channel TRPV3 


2665 


94 


483 


AAB86365 


Homo 
sapiens 


MEMO- Human ceramidase K3 protein. 


1069 


76 


483 


gil 7529684 


Mus 

musculus 


cancer related gene-liver 1 


1020 


70 


483 


gi18028135


Drosophila
melanogaster


brain washing 


442 


36 


484 


ABB89360 


Homo 
sapiens 


HUMA- Human polypeptide SEQ ID NO 
1736. 


251 


78 


484 


gil 574439 


Haemophilus
influenzae Rd


leucine responsive regulatory protein (lrp)


73 


38 


484 


gil2720483 


Pasteurella 
multocida 


Lrp


73 


38 


485 


AAY99347 


Homo 
sapiens 


GETH Human PRO1113 (UNQ556) amino
acid sequence SEQ ID NO:24.


2250 


99 


485 


gil 5987499 


Mus 

musculus 


tumor endothelial marker 5 precursor 


1863 


48 


485 


AAU74824 


Homo 
sapiens 


INCY- Human REPTR 7 protein. 


1812 


47 


486 


AAS12581_aa1


Homo 
sapiens 


PEKE cDNA encoding novel human G 
protein-coupled receptor (GPCR). 


1853 


100 


486 


AAS07946_aa1


Homo 
sapiens 


AREN- Human cDNA encoding G-protein 
coupled receptor, hRUP19. 


1853 


100 


486 


AAD27497_aa1


Homo 
sapiens 


EURO- Human G-protein coupled receptor 
(GPCRx14) DNA.


1853 


100 


487 


gi4959568 


Homo 
sapiens 


nuclear pore complex interacting protein 
NPIP 


1087 


67 


487 


ABB90262 


Homo 
sapiens 


HUMA- Human polypeptide SEQ ID NO 
2638. 


852 


71 


487 


gil 4603481 


Homo 
sapiens 


Similar to nuclear pore complex interacting 
protein 


644 


82 


488 


AAM25630 


Homo 
sapiens 


HYSE- Human protein sequence SEQ ID 
NO: 1145. 


554 


90 


488 


AAG63804 


Homo 
sapiens 


N1SC- Amino acid sequence of a human 
amino acid transporter. 


551 


98 


488 


gi9309293 


Homo 
sapiens 


asc-type amino acid transporter 1 


551 


98 


489 


AAM39751 


Homo 
sapiens 


HYSE- Human polypeptide SEQ ID NO 
2896. 


2304 


99 


489 


AAM41538 


Homo 
sapiens 


HYSE- Human polypeptide SEQ ID NO 
6469. 


2294 


99 


489 


AAM41537 


Homo 
sapiens 


HYSE- Human polypeptide SEQ ID NO 
6468. 


2294 


99 


490 


AAE06056 


Homo 
sapiens 


HUMA- Human gene 16 encoded secreted 
protein HM1AP86, SEQ ID NO:118.


1006 


75 


490 


AAY87079 


Homo
sapiens


HUMA- Human secreted protein sequence
SEQ ID NO:118.


1006


75


490 


AAY78511


Homo
sapiens


AMYL- Human uncoupling protein 4 (UCP-4)
amino acid sequence.


1006


75 


491 


AAG71803 


Homo 
sapiens 


YEDA Human olfactory receptor 
polypeptide, SEQ ID NO: 1484. 


1616 


100


491 


ABB06625


Homo 
sapiens 


CURA- G protein-coupled receptor 
GPCR13 protein SEQ ID NO:60. 


1608


99


491 


ABB06626 


Homo 
sapiens 


CURA- G protein-coupled receptor 
GPCR13b protein SEQ ID NO:62. 


1605 


99 


492


gi10440458


Homo 
sapiens 


FLJ00065 protein


992 


100 


492 


gi15545993


Homo 
sapiens 


Bcl-2 modifying factor 


992 


100 


492 


gi15545991


Mus

musculus


Bcl-2 modifying factor 


864 


87 


493 


AAG67525 


Homo 
sapiens 


SMIK Amino acid sequence of a human
secreted polypeptide. 


1841 


99 


493 


ABB90207 


Homo 
sapiens 


HUMA- Human polypeptide SEQ ID NO
2583. 


557 


38 


493 


AAB69185 


Homo 
sapiens 


SREN- Human hlSLR-iso protein SEQ ID
NO:7.


557 


38 


494 


ABB05727 


Homo 
sapiens 


GEHU- Human signal transduction protein 
clone tes3 5k22. 


111 


46 


494 


AAB12529


Homo 
sapiens 


SLOK Human Ma5 protein SEQ ID NO: 13. 


111 


46 


494 


gi6179740


Homo 
sapiens 


paraneoplastic neuronal antigen MA3 


111 


46 


495 


gi17862902


Drosophila
melanogaster


SD02518p 


845 


43 


495 


gi17861532


Drosophila
melanogaster


GH11618p


833 


42 


495 


gi530088 


Glycine max 


aminoalcoholphosphotransferase 


398 


28 


496


gi9963853


Homo 
sapiens 


HT018 


1368 


100 


497 


ABB90073


Homo 
sapiens 


HUMA- Human polypeptide SEQ ID NO 
2449. 


1286 


70 


497 


AAB12123


Homo 
sapiens 


PROT- Hydrophobic domain protein from 
clone HP10608 isolated from Saos-2 cells.


1286 


70 


497 


gi13241761


Homo 
sapiens 


transmembrane protein induced by tumor 
necrosis factor alpha 


1286 


70 


498


AddoMJUI 


Homo 
sapiens 


GETH Human PRO28631 protein sequence
SEQ ID NO:370. 


131 


27 


498 


AA Yoo234 


Homo 


HUMA- Human secreted protein 
HNTNP70 SFOIDNO-14Q 


123 


38 


498 


AAB65258 


Homo 
sapiens 


GETH Human PRO1153 (UNQ583) protein
sequence SEQ ID NO:351.


111 


54 


499 


AAB93704 


Homo 
sapiens 


HELI- Human protein sequence SEQ ID 
NO:13287. 


3677 


99 


499 


ABB07504 


Homo 
sapiens 


INCY- Human GTP-binding protein 
(GTPB) (ID: 4028409CD1). 


2960 


57 


499 


ABB07686 


Homo 
sapiens 


MERE Human GTPase-like protein, MFQ- 
111. 


2456 


56 


500 


gi21212948 


Mus
musculus


peroxisomal protein (PeP)


462


53


500 


gi310897


Thermobifida
fusca


beta-1,4-endoglucanase precursor


124 


35 


500 


gi485747 


Gallus
gallus 


protein-tyrosine phosphatase 


115 


32 


501 


AAB35156 


Homo 
sapiens 


SMIK Human nuclear receptor NOT1a
splice variant related protein. 


2750 


88 


501 


AAU09156 


Homo 
sapiens 


SMIK Human NOT1 orphan nuclear 
receptor. 


2750 


88 


501 


AAR48631 


Homo 
sapiens 


MAGE/ Sequence of nuclear receptor of T-
cells (NPT) steroid receptor protein.


2750 


88 


502 


AAU11383 


Homo 
sapiens 


SENO- Human T2R55 (hT2R55) 
polypeptide. 


1632 


98 


502 


gi20336515 


Homo 
sapiens 


candidate taste receptor T2RP24 


1632 


98 


502 


AAU11382 


Homo 
sapiens 


SENO- Human T2R54 (hT2R54) 
polypeptide. 


894 


57 


503 


AAB92909 


Homo 
sapiens 


HELI- Human protein sequence SEQ ID 
NO: 11539. 


3006 


98 


503 


gi17862912


Drosophila
melanogaster


SD02996p 


1037 


31 


503 


ABB90736 


Homo 

sapiens 


UYJO Human Tumour Endothelial Marker 
polypeptide SEQ ID NO 204. 


410 


24 


504 


ABB05730 


Homo 
sapiens 


ZYMO Human zcytor17 protein sequence
SEQ ID NO:2. 


3070 


99 


504 


gi20563277 


Homo 
sapiens 


gp130-like monocyte receptor


3070 


99 


504 


ABB05741 


Homo 
sapiens 


ZYMO Human zcytor17 protein sequence
SEQ ID NO:54. 


3066 


99 


505 


AAU80509 


Homo 
sapiens 


INCY- Human G-coupled receptor 
(GCREC) protein, Seq ID No 17. 


1781 


100 


505 


AAU11885


Homo 
sapiens 


CURA- Human novel G protein-coupled 
receptor, GPCR1a.


1595 


100 


505 


AAU11886


Homo 
sapiens 


CURA- Human novel G protein-coupled 
receptor, GPCR1b.


1589 


99 


506 


gi4102877 


Mus 

musculus 


Shc binding protein


2283 


69 


506 


gi12017952


Homo 
sapiens 


GE36 


464 


30 


506 


gi20906085 


Methanosarcina
mazei Goe1


surface layer protein B 


128 


23 


507 


AAB11699 


Homo 
sapiens 


FUSO Human serine protease BSSP2 
(hBSSP2), SEQ ID NO: 10. 


1404 


100 


507 


gi12248917


Homo 
sapiens 


spinesin


1404 


100 


507 


AAE14342 


Homo 
sapiens 


INCY- Human protease PRTS-7 protein. 


1236 


99 


508 


gi18032273


Mus 

musculus 


VPS10 domain receptor SorCS1c splice
variant 


5198 


96 


508 


gi18032275


Homo 
sapiens 


VPS10 domain receptor SorCS


5121 


99 


508 


gi7715916 


Mus 

musculus 


SorCSb splice variant of the VPS10 domain
receptor SorCS 


4963 


96


509


gi14278927


Mus 

musculus


gliacolin 


1291 


94 


509 


gi10566471


Mus 

musculus 


Gliacolin 


1291 


94 


509 


gi3747097 


Homo 
sapiens 


C1q-related factor


976 


70 


510 


gi12247892


Sterkiella
histriomuscorum


SPEC3-like protein 


90 


31 


510 


AAA99908_ 
aal 


Homo 
sapiens 


GETH cDNA encoding human protein 
PRO321.


71 


30 


510 


ABB84833 


Homo 
sapiens 


GETH Human PRO321 protein sequence
SEQ ID NO:34.


71 


30 


511 


ABB90246 


Homo 
sapiens 


HUMA- Human polypeptide SEQ ID NO 
2622. 


648 


100 


511 


AAB25755 


Homo 
sapiens 


HUMA- Human secreted protein sequence 
encoded by gene 33 SEQ ID NO: 144. 


648 


100 


511 


AAB25754 


Homo 
sapiens 


HUMA- Human secreted protein sequence 
encoded by gene 33 SEQ ID NO: 143. 


301 


100 


512 


gi13810306


Homo 
sapiens 


transmembrane protein 7 


1271 


100 


512 


gi18250724


Mus 

musculus 


transmembrane protein 7 


639 


64 


512 


gi15341942


Homo 
sapiens 


28kD interferon responsive protein


428 


38 


513 


AAG72504 


Homo 
sapiens 


YEDA Human OR-like polypeptide query 
sequence, SEQ ID NO: 2185. 


1615 


99 


513 


AAU24651 


Homo 
sapiens 


SENO- Human olfactory receptor 
AOLFR147. 


1615 


99 


513 


AAG71709 


Homo 
sapiens 


YEDA Human olfactory receptor 
polypeptide, SEQ ID NO: 1390. 


1611 


99 


514 


gi20381191


Homo 
sapiens 


Similar to RIKEN cDNA 4932443L08 gene 


2831 


99 


514 


AAB83079 


Homo 
sapiens 


SMIK Human CASB6411 protein.


1806 


100 






Homo 
sapiens 


INCY- A human leukocyte and blood 
related protein (LBAP). 


1424 


100 


515 


gi20072886 


Homo 
sapiens 


Similar to RIKEN cDNA 2610024A01 gene 


1456 


100 


515 


AAB74716


Homo 
sapiens 


INCY- Human membrane associated protein 
MEMAP-22. 


1094 


99 


515


Add 55/324 


Homo 
sapiens 


HUMA- Human polypeptide SEQ ID NO 
1900. 


513 


98 


516 


AAU00141


Homo 
sapiens 


MILL- Human LGR6 polypeptide (clone 
Fbhl50881). 


3804 


99 




/VrVVJUO I HU 


Homo 
sapiens 


MILL- Human LGR6 polypeptide (clone
fehr). 


3504


go 


516 


gi10441732


Homo 
sapiens 


leucine-rich repeat-containing G protein- 
coupled receptor 6 


3782 


100 


517 


AAB24465 


Homo 
sapiens 


HUMA- Human secreted protein sequence 
encoded by gene 29 SEQ ID NO:90. 


447 


98 


518 


AAM40227 


Homo 
sapiens 


HYSE- Human polypeptide SEQ ID NO 
3372. 


909 


34 


518 


gi21321124 


Rattus 
norvegicus 


proton-associated sugar transporter A


898 


34

518 


gi4ooUZ29 


Homo 
sapiens 


DNb-5 


537 


29 


519


ABBU7253 


Homo
sapiens


LEXI- Human novel GPCR (NGPCR)
protein. 


3943 


99 


519 


AAMoyoOV 


Homo 
sapiens 


MOLE- Human bone marrow expressed 
probe encoded protein SEQ ID NO: 29913. 


1770


82 


519 


AAM57201 


Homo 
sapiens 


MOLE- Human brain expressed single exon 
probe encoded protein SEQ ID NO: 29306. 


1770 


82 


520 


AAM43601 


Homo 
sapiens 


HUMA- Human polypeptide SEQ ID NO 
279. 


1229 


99 


520 


AAU18290 


Homo 
sapiens 


HUMA- Human endocrine polypeptide SEQ
ID No 245. 


1228 


99 


520 


AAY27577 


Homo 
sapiens 


HUMA- Human secreted protein encoded 
by gene No. 11.


598 


100 


521 


AAB94304 


Homo 

sapiens 


HELI- Human protein sequence SEQ ID 
NO: 14767. 


1523 


100 


521 


AAD23974_ 
aal 


Homo 
sapiens 


INCY- Human neurotransmitter transporter, 
NTT-2 cDNA. 


1350 


92 


521 


AAE14404


Homo 
sapiens 


INCY- Human neurotransmitter transporter, 
NTT-2. 


1350 


92 


522 


AAB74730 


Homo 
sapiens 


INCY- Human membrane associated protein 
MEMAP-36. 


637 


37 


522 


AAY94906 


Homo 
sapiens 


GEMY Human secreted protein clone 
rb649_3 protein sequence SEQ ID NO: 18.


637 


37 


522 


AAM40237 


Homo 
sapiens 


HYSE- Human polypeptide SEQ ID NO 
3382. 


523 


37 


523 


AAB43665 


Homo 
sapiens 


HUMA- Human cancer associated protein 
sequence SEQ ID NO: 1110. 


1254 


100 


523 


AAY19759 


Homo 
sapiens 


HUMA- SEQ ID NO 477 from 
WO9922243.


966 


100 


523 


gi21428606 


Drosophila
melanogaster


LD47425p 


939 


70 


524 


AAH42183_ 
aa2 


Homo 
sapiens 


PHAA Nucleotide sequence of a G-protein
coupled receptor. 


1925 


94 


524 


ABB06303 


Homo 
sapiens 


TAKE Human ZAQ protein sequence SEQ 
ID NO:1.


1925


94 


524 


AAB70143 


Homo 
sapiens 


TAKE Human G protein-coupled receptor 
protein. 


1925


94 


525 


AAB93258 


Homo 
sapiens 


HELI- Human protein sequence SEQ ID 
NO: 12282. 


930 


53 


525 


AAY28810


Homo 
sapiens 


GEMY nn296_2 secreted protein. 


930 


53 


525 


gi17944467


Drosophila
melanogaster


RH03777p 


749 


48 


526 


AAM48989 


Homo 
sapiens 


TAKE Human testis originated G-protein
coupled receptor TGR10. 


1061


97 


526 


gi13876663


lumpy skin 
disease virus 


G-protein-coupled chemokine receptor-like 
protein 


191


25 


526 


gi7108517 


Oryctolagus 
cuniculus 


chemokine receptor 


190 


29 


527 


gi12214288


Homo 
sapiens 


dJ402H5.2 (novel protein similar to worm 
and fly proteins) 


2655 


100 


527 


gi3880799 


Caenorhabditis
elegans


Y39A1B.2


431


23


527 


gi15718594


Caenorhabditis
elegans


C. elegans PTR-10 protein (corresponding 
sequence F55F8.1) 


430 


23


528 


ABB89636 


Homo 
sapiens 


HUMA- Human polypeptide SEQ ID NO 
2012. 


817 


100 


528 


gi21483396


Drosophila
melanogaster


LD22376p 


813 


40 


528 


gi18480372


Mus 

musculus 


olfactory receptor MOR145-3


82 


25 


529 


AAM50125 


Homo 
sapiens 


MILL- Human acyltransferase 46743. 


1874 


100 


529 


AAB65222 


Homo 
sapiens 


GETH Human PRO1108 (UNQ551) protein
sequence SEQ ID NO:248. 


1583 


69 


529 


AAM00959 


Homo 
sapiens 


HYSE- Human bone marrow protein, SEQ 
ID NO: 435. 


1583 


69 


530 


ABB11531 


Homo 
sapiens 


HYSE- Human secreted protein homologue, 
SEQ ID NO:1901.


1290 


99 


530 


AAM25596 


Homo 
sapiens 


HYSE- Human protein sequence SEQ ID
NO:1111.


1289 


99 


530 


ABB55767 


Homo 
sapiens 


FECH/ Human polypeptide SEQ ID NO 
140. 


1282 


99 


531 


AAI66039_ 
aal 


Homo 
sapiens 


KYOW Human G protein-coupled receptor 
encoding cDNA SEQ ID NO 2. 


787 


100 


531 


AAA64346_ 
aal 


Homo 
sapiens 


MILL- DNA encoding a human G-protein 
coupled receptor designated 14273. 


787 


100 


531


AAE04564 


Homo 
sapiens 


INCY- Human G-protein coupled receptor- 
20 (GCREC-20) protein. 


787 


100 


532 


AAU11888


Homo 
sapiens 


CURA- Human novel G protein-coupled 
receptor, GPCR3a. 


1747 


99 


532 


AAU24662 


Homo 
sapiens 


SENO- Human olfactory receptor 
AOLFR160. 


1747


99 


532 


AAU11889


Homo 
sapiens 


CURA- Human novel G protein-coupled 
receptor, GPCR3b. 


1632 


98 


533 


gi557822 


Saccharomyces
cerevisiae


mal5, sta1, len: 1367, CAI: 0.3,
AMYH_YEAST P08640
GLUCOAMYLASE S1 (EC 3.2.1.3)


314 


25 


533 


gi1304387


Saccharomyces
cerevisiae var.
diastaticus


glucoamylase 


314 


25 












533 


gi915208 


Sus scrofa 


gastric mucin 


307 


25 


534 


AAU00437 


Homo 
sapiens 


COUN- Human dendritic cell membrane 
protein FIRE. 


1997 


88 


534 


AAY91625 


Homo 
sapiens 


HUMA- Human secreted protein sequence 
encoded by gene 22 SEQ ID NO:298. 


1836 


96 


534 


gi16930385


Mus 

musculus 


seven-span membrane protein FIRE 


1445 


62 


535 


AAB61148 


Homo 
sapiens 


CURA- Human NOV17 protein.


2306 


59 


535 


gi18676416


Homo 
sapiens 


FLJ00080 protein 


1900 


57 


535 


AAB61147


Homo 
sapiens 


CURA- Human NOV16 protein.


1378 


53


536


AAB61148 


Homo 
sapiens 


CURA- Human NOV17 protein.


2306 


59 


536 


gi18676416


Homo 
sapiens 


FLJ00080 protein 


1900 


57 


536 


AAB61147 


Homo 
sapiens 


CURA- Human NOV16 protein.


1378 


53 


537 


gi14325132


Thermoplasma
volcanium


tricorn protease 


75 


29 


537 


gi21064441


Drosophila
melanogaster


RE29777p 


74 


30 


537 


gi|13541726|ref|NP_111414.1|


Thermoplasma
volcanium


Tricorn protease 


75 


29 


538 


AAG71899 


Homo 
sapiens 


YEDA Human olfactory receptor 
polypeptide, SEQ ID NO: 1580. 


1603 


100 


538 


AAU24548 


Homo 
sapiens 


SENO- Human olfactory receptor 
AOLFR35. 


1603 


100 


538 


AAE06770 


Homo 
sapiens 


INCY- Human G-protein coupled receptor- 
20 (GCREC-20) protein. 


1598 


100 


539 


AAG81420 


Homo 
sapiens 


ZYMO Human AFP protein sequence SEQ 
ID NO:358. 


403 


98 


539 


AAM93259 


Homo 
sapiens 


HELI- Human polypeptide, SEQ ID NO: 
2709. 


327 


38 


539 


gi16877659


Homo 
sapiens 


Similar to RIKEN cDNA 1810054O13 gene


314 


38 


540 


AAG89209 


Homo 
sapiens 


GEST Human secreted protein, SEQ ID NO: 
329. 


460 


97 


540 


gi1890812


Flexamia 
graminea 


NADH dehydrogenase 1 


71 


24 


540 


gi|21295981|gb|EAA08126.1|


Anopheles 
gambiae str. 
PEST 


agCP1281 


73 


28 


541 


ABB89210 


Homo 
sapiens 


HUMA- Human polypeptide SEQ ID NO 
1586. 


851 


99 


541 


AAY73442 


Homo 
sapiens 


GEMY Human secreted protein clone 
ya66_1 protein sequence SEQ ID NO: 106.


596 


95 


541 


AAB63255 


Homo 
sapiens 


LUDW- Human breast cancer associated 
antigen protein sequence SEQ ID NO:617. 


88 


40 


542 


gi9929918 


Homo 
sapiens 


intestinal mucin 


4024 


99 


542 


gi11990203


Homo 
sapiens 


MUC3B mucin 


3985 


98 


542 


gi9929920 


Homo
sapiens


intestinal mucin 


3908 


96 


543 


gi17483744


Mus 

musculus 


RING finger protein 33 


1115 


47 


543 


gi14043332


Homo 
sapiens 


Similar to ring finger protein 23 


913 


40 


543 


gi10716078


Mus 

musculus 


testis-abundant finger protein 


907 


40 


544 


AAG76127 


Homo 
sapiens 


HUMA- Human colon cancer antigen 
protein SEQ ID NO:6891.


260 


68 


544 


AAG03891 


Homo
sapiens


GEST Human secreted protein, SEQ ID NO:
7972.


260


68


544 


gi57131 


Rattus 
norvegicus 


ribosomal protein S26


260 


68 


545 


AAU74820 


Homo 
sapiens 


INCY- Human REPTR 3 protein. 


1737 


42 


545 


gi6683905 


Drosophila
melanogaster


Dispatched 


1073 


31 


545 


AAU03497 


Homo 
sapiens 


UYZU- Human sterol sensing domain 
protein. 


885 


43 


546 


AAM78329 


Homo 
sapiens 


HYSE- Human protein SEQ ID NO 991.


933 


70 


546 


ABL41227_ 
aal 


Homo 
sapiens 


SWIT- Human G-protein coupled receptor 
encoding cDNA SEQ ID NO 8. 


585 


58 


546 


AAS16914_ 
aal 


Homo 
sapiens 


PEKE Human G-protein coupled receptor 
(GPCR) cDNA. 


585 


58 


547 


gi20067221 


Homo 
sapiens 


Down syndrome cell adhesion molecule 2 


11077 


100 


547 


gi18033452


Homo 
sapiens 


Down syndrome cell adhesion molecule 
DSCAML1 


10745 


99 


547 


AAM39040 


Homo 
sapiens 


HYSE- Human polypeptide SEQ ID NO 
2185. 


9116 


100 


548 


gi12656633


Homo 
sapiens 


transmembrane gamma-carboxyglutamic 
acid protein 3 TMG3 


1192 


100 


548 


AAM93243 


Homo 
sapiens 


HELI- Human polypeptide, SEQ ID NO:
2675. 


1186 


99 


548 


gi20977032 


Xenopus 
laevis 


mitotic phosphoprotein 77 


359 


38 


549 


AAG89138 


Homo 
sapiens 


GEST Human secreted protein, SEQ ID NO: 
258. 


709 


74 


549 


AAE13062 


Homo 
sapiens 


AMGE- Human CD20/IgE-receptor like 
protein, agp-96614-al. 


709 


74 


549 


gi11559214


Homo 
sapiens 


MS4A5 


709 


74 


550 


AAG72074 


Homo 
sapiens 


YEDA Human olfactory receptor 
polypeptide, SEQ ID NO: 1755. 


1853 


100 


550 


AAG71493 


Homo 
sapiens 


YEDA Human olfactory receptor 
polypeptide, SEQ ID NO: 1174.


1853 


100 


550 


gi12054409


Homo 
sapiens 


olfactory receptor 


1853 


100 


551 


AAB47932 


Homo 
sapiens 


SEIN/ Human Na+-driven Cl-/HCO3-
exchanger.


5677 


99 


551 


gi11275360


Homo 
sapiens 


NCBE 


5677 


99 


551 


gi11182364


Mus 

musculus 


NCBE 


5542 


96


552 


AAE04178 


Homo 
sapiens 


HUMA- Human gene 3 encoded secreted
protein fragment, SEQ ID NO: 169. 


1111 


98 


552 


AAE04127 


Homo 
sapiens 


HUMA- Human gene 3 encoded secreted 
protein HSDJL42, SEQ ID NO:114.


1078 


98 


552 


AAE04102 


Homo 
sapiens 


HUMA- Human gene 3 encoded secreted 
protein HSDJL42, SEQ ID NO:88. 


1068 


98


277


PR00217 


43 KD POSTSYNAPTIC PROTEIN 
SIGNATURE 


PR00217C 10.91 3.753e-10 235-250 


278 


PR00217 


43 KD POSTSYNAPTIC PROTEIN 
SIGNATURE 


PR00217C 10.91 3.753e-10 211-226


281 


PD01572 


PHOTOSYSTEM II REACTION 
CENTRE T PROTEIN PHOTOS. 


PD01572 8.77 4.083e-09 1-30


282 


BL00421 


Transmembrane 4 family proteins. 


BL00421E 20.97 4.000e-20 137-166 
BL00421C 12.89 6.571e-12 77-88
BL00421A 11.79 1.563e-11 7-25


282 


PR00259 


TRANSMEMBRANE FOUR FAMILY 
SIGNATURE 


PR00259D 13.50 8.200e-12 140-166
PR00259C 16.40 1.684e-09 13-41
PR00259A 9.27 4.405e-09 11-34


282 


PR00218 


PERIPHERIN (RDS)/ROM-1 FAMILY
SIGNATURE 


PR00218D 6.22 4.894e-09 76-104 


286 


PR00237 


RHODOPSIN-LIKE GPCR 
SUPERFAMILY SIGNATURE 


PR00237A 11.48 5.355e-09 373-397


290 


PR00970 


ARGININE ADP-
RIBOSYLTRANSFERASE
SIGNATURE


PR00970A 17.73 6.906e-21 30-51 
PR00970D 9.96 8.920e-20 133-149 
PR00970F 12.30 9.250e-15 199-215
PR00970E 11.23 1.265e-14 178-193 
PR00970G 9.97 3.700e-14 220-235 
PR00970C 11.05 7.000e-14 90-104 
PR00970B 16.37 7.387e-13 59-77 


290 


BL01291 


NAD:arginine ADP-ribosyltransferases 
proteins. 


BL01291F 23.30 5.974e-40 180-232 
BL01291D 19.99 9.471e-31 115-148 
BL01291A 22.07 4.892e-26 29-58 
BL01291C 14.06 7.387e-17 87-102 
BL01291G 15.18 4.176e-16 243-261 
BL01291B 9.15 2.800e-11 69-82
BL01291E 7.03 1.000e-09 161-170


292 


BL00983 


Ly-6 / u-PAR domain proteins. 


BL00983C 12.69 4.326e-10 92-107 


292 


BL00272 


Snake toxins proteins. 


BL00272C 8.27 9.372e-09 96-107 


294 


BL00290 


Immunoglobulins and major 
histocompatibility complex proteins. 


BL00290B 13.17 9.308e-15 168-185 
BL00290A 20.89 1.450e-12 129-151 


295 


BL00571 


Amidases proteins. 


BL00571 25.69 4.188e-31 195-246


296 


BL01271 


Sodium:sulfate symporter family
proteins. 


BL01271D 25.26 1.000e-40 505-559
BL01271C 13.62 6.824e-21 432-453 
BL01271B 12.02 9.206e-21 240-264 
BL01271A 8.06 8.800e-20 131-150 


298 


PD00131 


ATP-BINDING TRANSPORT 
TRANSMEMBR. 


PD00131B 34.97 9.308e-32 480-533 
PD00131C 19.59 1.000e-29 628-665 


298 


BL00211


ABC transporters family proteins. 


BL00211B 13.37 7.750e-29 580-611
BL00211A 12.23 2.588e-10 474-485


298 


PR00988 


URIDINE KINASE SIGNATURE 


PR00988A 6.39 6.838e-09 469-486 


304 


PD01572 


PHOTOSYSTEM II REACTION 
CENTRE T PROTEIN PHOTOS.


PD01572 8.77 4.083e-09 1-30 


308 


BL00942 


glpT family of transporters proteins. 


BL00942B 20.36 1.750e-10 82-124 
BL00942F 15.07 1.771e-10 339-356
BL00942C 14.04 6.610e-09 171-190 


308 


PD02963 


COMPONENT 

PHOSPHOTRANSFERASE SYST. 


PD02963B 5.41 6.776e-09 342-357 


309 


PR00245 


OLFACTORY RECEPTOR 
SIGNATURE 


PR00245A 18.03 5.909e-21 59-80 


309 


BL00237 


G-protein coupled receptors proteins. 


BL00237A 27.68 9.743e-13 90-129 


309 


PR00237 


RHODOPSIN-LIKE GPCR
SUPERFAMILY SIGNATURE


PR00237B 13.50 9.280e-12 59-80
PR00237C 15.69 6.914e-10 104-126
PR00237A 11.48 4.774e-09 26-50


311 


PR00254


NICOTINIC ACETYLCHOLINE 
RECEPTOR SIGNATURE 


PR00254A 11.23 5.765e-14 64-80
PR00254D 15.50 2.023e-12 134-152
PR00254B 12.97 1.973e-11 98-112


311 


BL00236 


Neurotransmitter-gated ion-channels
proteins. 


BL00236A 21.96 5.050e-25 57-94 
BL00236C 25.16 7.097e-25 139-177 
BL00236D 25.66 8.105e-21 223-264 
BL00236B 14.67 3.813e-11 111-120


311 


PR00252 


NEUROTRANSMITTER-GATED 
ION CHANNEL FAMILY 
SIGNATURE 


PR00252A 14.28 5.696e-14 77-93 
PR00252C 17.49 9.775e-12 154-168 
PR00252B 15.17 2.406e-10 110-121 


312 


PD02327 


GLYCOPROTEIN ANTIGEN 
PRECURSOR IMMUNOGLO. 


PD02327B 19.84 2.091e-09 144-165 


312 


DM00179 


w KINASE ALPHA ADHESION T- 
CELL. 


DM00179 13.97 7.652e-09 291-300 


313 


PR00019 


LEUCINE-RICH REPEAT 
SIGNATURE 


PR00019A 11.19 8.043e-10 164-177
PR00019B 11.36 7.120e-09 136-149 


313 


BL00240 


Receptor tyrosine kinase class III 
proteins. 


BL00240B 24.70 7.319e-09 319-342 


316 


BL00237 


G-protein coupled receptors proteins. 


BL00237A 27.68 2.600e-10 45-84 


316 


PR00534 


MELANOCORTIN RECEPTOR 
FAMILY SIGNATURE 


PR00534A 11.49 9.446e-10 6-18


316 


PR00245 


OLFACTORY RECEPTOR 
SIGNATURE 


PR00245C 7.84 4.750e-18 193-208 
PR00245A 18.03 4.808e-15 14-35 
PR00245E 12.40 9.043e-11 246-260
PR00245B 10.38 2.102e-09 132-146 


316 


PR00237 


RHODOPSIN-LIKE GPCR
SUPERFAMILY SIGNATURE 


PR00237C 15.69 8.875e-09 59-81 


320 


PR00518 


5-HYDROXYTRYPTAMINE 5A 
RECEPTOR SIGNATURE 


PR00518D 8.59 9.471e-21 230-246 
PR00518E 11.20 8.898e-12 246-255
PR00518C 5.94 1.000e-11 180-188


320 


PR00237 


RHODOPSIN-LIKE GPCR 
SUPERFAMILY SIGNATURE 


PR00237C 15.69 4.462e-19 118-140
PR00237G 19.63 7.261e-16 317-343
PR00237F 13.57 1.857e-15 280-304
PR00237E 13.03 4.600e-14 198-221
PR00237D 8.94 1.900e-11 154-175
PR00237B 13.50 7.517e-11 72-93


320 


BL00237 


G-protein coupled receptors proteins. 


BL00237A 27.68 4.938e-27 104-143 
BL00237C 13.19 2.500e-17 275-301 
BL00237D 11.23 5.846e-11 327-343
BL00237B 5.28 6.727e-09 206-217 


321 


PR00237 


RHODOPSIN-LIKE GPCR 
SUPERFAMILY SIGNATURE 


PR00237A 11.48 8.714e-12 17-41 
PR00237G 19.63 4.600e-11 291-317
PR00237B 13.50 3.531e-10 50-71


326 


PR00007 


COMPLEMENT C1Q DOMAIN
SIGNATURE 


PR00007B 14.16 6.657e-15 152-171 
PR00007C 15.60 2.047e-14 200-221 
PR00007A 19.33 8.412e-12 125-151 


326 


BL00415 


Synapsins proteins. 


BL00415N 4.29 7.307e-09 63-106 


326 


BL01113 


C1q domain proteins.


BL01113B 18.26 3.647e-27 131-166 
BL01113A 17.99 1.000e-13 68-94 
BL01113C 13.18 2.532e-13 200-219 
BL01113A 17.99 7.081e-13 59-85
BL01113A 17.99 8.297e-13 56-82
BL01113A 17.99 3.538e-12 65-91
BL01113A 17.99 5.385e-12 71-97
BL01113A 17.99 5.909e-11 74-100
BL01113A 17.99 8.773e-11 62-88
BL01113A 17.99 9.135e-09 53-79


326 


BL00420 


Speract receptor repeat proteins domain 
proteins. 


BL00420A 20.42 4.808e-12 56-84 
BL00420A 20.42 8.967e-10 53-81 
BL00420A 20.42 7.231e-09 71-99 
BL00420A 20.42 9.169e-09 77-105 


330 


PR00237 


RHODOPSIN-LIKE GPCR 
SUPERFAMILY SIGNATURE 


PR00237E 13.03 6.400e-12 76-99 
PR00237D 8.94 1.450e-11 26-47


330 


BL00237 


G-protein coupled receptors proteins. 


BL00237C 13.19 7.000e-09 114-140 
BL00237B 5.28 9.182e-09 84-95 


333 


BL00943 


Cytochrome c oxidase assembly factor 
COX10/ctaB/cyoE signatur.


BL00943A 22.06 6.087e-17 117-155


334 


PD00866 


GLYCOPROTEIN PROTEIN SPIKE 
E2 PRECURSOR PEPLOMER. 


PD00866L 3.73 6.902e-09 172-181 


338 


PR00237 


RHODOPSIN-LIKE GPCR 
SUPERFAMILY SIGNATURE 


PR00237C 15.69 5.371e-10 103-125 


338 


PR00245 


OLFACTORY RECEPTOR 
SIGNATURE 


PR00245A 18.03 2.473e-14 58-79 
PR00245B 10.38 5.500e-13 176-190 
PR00245E 12.40 2.149e-11 290-304
PR00245D 10.47 5.814e-10 273-284


338 


BL00237 


G-protein coupled receptors proteins. 


BL00237A 27.68 4.818e-14 89-128 
BL00237D 11.23 5.364e-09 281-297


339 


PR00237 


RHODOPSIN-LIKE GPCR 
SUPERFAMILY SIGNATURE 


PR00237C 15.69 5.371e-10 103-125 


339 


PR00245 


OLFACTORY RECEPTOR 
SIGNATURE 


PR00245A 18.03 2.473e-14 58-79 
PR00245B 10.38 5.500e-13 176-190 
PR00245D 10.47 5.814e-10 273-284 


339 


BL00237 


G-protein coupled receptors proteins. 


BL00237A 27.68 4.818e-14 89-128 
BL00237D 11.23 5.364e-09 281-297


340 


PR00878 


CHOLINESTERASE SIGNATURE 


PR00878F 5.37 4.780e-13 523-535


340 


BL00122 


Carboxylesterases type-B serine 
proteins. 


BL00122E 22.02 1.563e-25 254-294 
BL00122A 12.04 5.929e-16 69-89
BL00122D 12.53 4.484e-14 230-245 
BL00122B 16.84 5.800e-14 139-149 
BL00122G 11.67 8.615e-13 561-571 
BL00122C 7.91 3.118e-11 201-211
BL00122F 11.10 3.000e-10 306-315 


340 


BL01173 


Lipolytic enzymes G-D-X-G family, 
histidine. 


BL01173A 9.41 5.245e-10 203-215


341 


BL00649 


G-protein coupled receptors family 2 
proteins. 


BL00649C 17.82 6.564e-13 711-736


341 


PR00249 


SECRETIN-LIKE GPCR 
SUPERFAMILY SIGNATURE 


PR00249C 17.08 4.323e-10 713-736 


341


BL01187


Calcium-binding EGF-like domain 
proteins pattern proteins. 


pi All QTD 10 f\A O 715ft AQ 100 11*7 


342 


BL00237 


G-protein coupled receptors proteins. 


BL00237A 27.68 5.629e-13 90-129 


342 


PR00245 


OLFACTORY RECEPTOR 
SIGNATURE 


PR00245A 18.03 2.565e-17 59-80 
PR00245E 12.40 9.735e-13 226-240
PR00245C 7.84 3.591e-09 174-189


343 


PF00954 


S-locus glycoprotein family. 


PF00954E 23.75 6.798e-09 152-202 


343 


BL00246 


Wnt-1 family proteins. 


BL00246E 20.32 8.306e-09 141-186 


344 


BL00237 


G-protein coupled receptors proteins. 


BL00237A 27.68 9.455e-14 93-132 


344 


PR00245 


OLFACTORY RECEPTOR
SIGNATURE


PR00245A 18.03 1.000e-18 62-83
PR00245B 10.38 9.143e-16 180-194
PR00245C 7.84 1.360e-13 241-256
PR00245E 12.40 7.882e-13 294-308
PR00245D 10.47 1.000e-10 277-288


344 


PR00237 


RHODOPSIN-LIKE GPCR 
SUPERFAMILY SIGNATURE 


PR00237C 15.69 4.600e-10 107-129 
PR00237G 19.63 1.209e-09 275-301 


345 


PR00249 


SECRETIN-LIKE GPCR 
SUPERFAMILY SIGNATURE 


PR00249C 17.08 9.129e-11 464-487
PR00249E 14.90 4.493e-10 549-574


345 


BL00649 


G-protein coupled receptors family 2 
proteins. 


BL00649C 17.82 6.073e-13 462-487
BL00649E 15.34 2.857e-12 549-578
BL00649G 13.52 8.826e-11 722-747
BL00649B 20.68 8.548e-09 406-451 


345 


BL01187 


Calcium-binding EGF-like domain
proteins pattern proteins. 


BL01187B 12.04 7.600e-11 87-102
BL01187A 9.98 1.000e-08 68-79


346 


PR00249 


SECRETIN-LIKE GPCR 
SUPERFAMILY SIGNATURE 


PR00249C 17.08 9.129e-11 368-391
PR00249E 14.90 4.493e-10 453-478 


346 


BL00649 


G-protein coupled receptors family 2 
proteins. 


BL00649C 17.82 6.073e-13 366-391
BL00649E 15.34 2.857e-12 453-482
BL00649G 13.52 8.826e-11 626-651
BL00649B 20.68 8.548e-09 310-355 


355 


PR00019 


LEUCINE-RICH REPEAT 
SIGNATURE 


PR00019B 11.36 9.500e-11 144-157
PR00019A 11.19 5.696e-10 147-160 
PR00019B 11.36 6.400e-10 95-108 
PR00019B 11.36 5.320e-09 119-132 


355 


PR00014 


FIBRONECTIN TYPE III REPEAT
SIGNATURE 


PR00014C 15.44 8.043e-09 435-453 


357 


BL00427 


Disintegrins proteins. 


BL00427 13.93 9.384e-24 443-497 


357 


PR00289 


DISINTEGRIN SIGNATURE


PR00289A 13.62 4.000e-14 457-476 
PR00289B 11.79 6.745e-11 486-498


357 


BL00142 


Neutral zinc metallopeptidases, zinc-
binding region proteins.


BL00142 8.38 2.125e-10 343-353 


358 


PD01270 


RECEPTOR FC 
IMMUNOGLOBULIN AFFIN. 


PD01270C 19.54 4.919e-14 116-144 
PD01270B 22.18 4.462e-10 73-109 


359 


PD01270 


RECEPTOR FC 
IMMUNOGLOBULIN AFFIN. 


PD01270C 19.54 4.919e-14 110-138 
PD01270B 22.18 4.462e-10 67-103 


368 


PR00463 


E-CLASS P450 GROUP I 
SIGNATURE 


PR00463E 17.37 4.667e-12 344-370


368 


PR00385 


P450 SUPERFAMILY SIGNATURE 


PR00385A 14.97 1.783e-13 335-352 
PR00385B 10.22 5.950e-12 353-366 


368 


PR00464 


E-CLASS P450 GROUP II 
SIGNATURE 


PR00464C 18.84 7.750e-22 324-352 
PR00464A 20.47 7.300e-17 149-169 
PR00464D 17.40 6.538e-14 353-370
PR00464B 20.41 1.000e-11 205-223


368 


PR00408 


MITOCHONDRIAL P450 
SIGNATURE 


PR00408D 15.44 8.099e-09 335-352 


11 c\ 
J fv 


PR00001


COAGULATION FACTOR GLA
DOMAIN SIGNATURE


PR00001B 10.75 9.000e-15 70-83
PR00001A 12.78 5.800e-10 56-69 


371 


BL00406 


Actins proteins. 


BL00406D 12.58 3.143e-19 257-311 
BL00406A 9.95 5.729e-13 15-49
BL00406B 5.47 7.429e-12 51-105
BL00406C 6.75 9.682e-12 110-164


371 


PR00735 


GLYCOSYL HYDROLASE FAMILY 
8 SIGNATURE 


PR00735D 12.75 1.000e-08 363-374 


377 


BL00120 


Lipases, serine proteins. 


BL00120B 11.37 1.383e-10 124-138 


377 


PR00793 


PROLYL AMINOPEPTIDASE (S33)
FAMILY SIGNATURE


PR00793C 12.24 9.500e-09 128-142




378 


BL00120 


Lipases, serine proteins. 


BL00120B 11.37 1.383e-10 124-138 


378 


PR00793 


PROLYL AMINOPEPTIDASE (S33) 
FAMILY SIGNATURE 


PR00793C 12.24 9.500e-09 128-142 


382 


PR00761 


BINDIN PRECURSOR SIGNATURE 


PR00761E 14.32 1.663e-09 188-206 


388 


PR00420 


AROMATIC-RING HYDROXYLASE 
(FLAVOPROTEIN 
MONOOXYGENASE) SIGNATURE 


PR00420A 14.78 4.638e-13 15-37 


388 


PR00757 


FLAVIN-CONTAINING AMINE 
OXIDASE SIGNATURE 


PR00757A 6.64 1.414e-10 15-34 


388 


PR00419 


ADRENODOXIN REDUCTASE 
FAMILY SIGNATURE 


PR00419A 14.89 4.094e-10 15-37 


388 


PR00072 


MALIC ENZYME SIGNATURE 


PR00072F 8.87 5.922e-09 16-32 


388 


BL00623 


GMC oxidoreductases proteins. 


BL00623A 12.60 8.200e-09 15-33 


388 


PR00368


FAD-DEPENDENT PYRIDINE 
NUCLEOTIDE REDUCTASE 
SIGNATURE 


PR00368A 17.76 9.839e-09 15-37 


396 


BL00031 


Nuclear hormones receptors DNA- 
binding region proteins. 


BL00031A 19.55 9.471e-34 102-134 
BL00031B 22.25 2.216e-22 135-166 


396 


PR00398 


STEROID HORMONE RECEPTOR 
SIGNATURE 


PR00398A 14.44 3.328e-16 102-119
PR00398C 13.47 1.450e-10 143-161 


396 


PR00350 


VITAMIN D RECEPTOR 
SIGNATURE 


PR00350B 9.35 2.125e-12 119-138
PR00350F 8.61 4.385e-10 399-422 
PR00350A 10.48 7.871e-09 102-118 


396 


PR00047 


C4-TYPE STEROID RECEPTOR 
ZINC FINGER SIGNATURE 


PR00047A 15.70 5.500e-19 102-118 
PR00047B 7.63 4.522e-17 118-133
PR00047D 13.53 9.550e-10 158-166
PR00047C 5.40 8.788e-09 150-158


398 


PD01672 


+ TRANSPORT EXCHANGER NA H 
TRANS. 


PD01672B 15.16 1.115e-24 125-173 
PD01672D 10.50 5.275e-18 207-243
PD01672I 17.98 5.939e-16 402-448
PD01672G 15.27 1.600e-12 318-351
PD01672C 16.18 3.933e-12 172-206 
PD01672H 22.99 4.949e-10 355-401 


403 


PD02797 


HYDROLASE CELL WALL N- 
ACETYLMURAMOYL-L-AL. 


PD02797D 19.90 9.032e-09 120-159 


405 


PR00456 


RIBOSOMAL PROTEIN P2
SIGNATURE 


PR00456E 3.06 8.861e-09 77-91 


411 


PR00237 


RHODOPSIN-LIKE GPCR 
SUPERFAMILY SIGNATURE 


PR00237C 15.69 2.575e-09 104-126 


411 


BL00237


G-protein coupled receptors proteins. 


BL00237A 27.68 9.419e-15 90-129 
BL00237D 11.23 5.636e-09 282-298


411 


PR00896 


VASOPRESSIN RECEPTOR 
SIGNATURE 


PR00896B 9.01 7.577e-09 55-66 


411 


PR00245 


OLFACTORY RECEPTOR 
SIGNATURE 


PR00245C 7.84 9.053e-19 238-253 
PR00245A 18.03 7.907e-18 59-80 
PR00245E 12.40 2.731e-14 291-305 
PR00245D 10.47 8.531e-09 274-285 


412 


PR00646 


RDC1 ORPHAN RECEPTOR 
SIGNATURE 


PR00646I 10.54 1.110e-26 301-320 
PR00646D 15.99 1.540e-26 85-103 
PR00646G 14.95 1.281e-25 173-190 
PR00646B 6.02 1.978e-25 21-40 
PR00646A 16.77 9.438e-24 4-21 
PR00646F 10.13 1.150e-23 156-173 
PR00646C 18.45 1.170e-23 49-64 
PR00646E 9.52 5.500e-23 127-144
PR00646H 6.32 1.101e-20 219-234


412 


BL00237 


G-protein coupled receptors proteins. 


BL00237A 27.68 4.789e-24 92-131 
BL00237C 13.19 9.280e-14 227-253
BL00237D 11.23 7.857e-13 289-305


412 


PR00237 


RHODOPSIN-LIKE GPCR 
SUPERFAMILY SIGNATURE 



PR00237C 15.69 8.800e-18 106-128 
PR00237B 13.50 2.000e-15 61-82 
PR00237G 19.63 2.800e-15 279-305 
PR00237F 13.57 1.000e-14 232-256 
PR00237E 13.03 4.333e-11 195-218
PR00237D 8.94 4.375e-10 142-163 


412 


PR00425 


BRADYKININ RECEPTOR 
SIGNATURE 


PR00425C 13.23 8.286e-10 92-111


412 


PR00526 


FORMYL-METHIONYL PEPTIDE 
RECEPTOR SIGNATURE 


PR00526C 13.54 9.550e-10 100-117 


412 


PR00241 


ANGIOTENSIN II RECEPTOR
SIGNATURE 


PR00241C 8.90 4.536e-09 115-122


413 


PR00049 


WILM'S TUMOUR PROTEIN
SIGNATURE 


PR00049D 0.00 3.438e-12 117-131


415 


PR00120 


H+-TRANSPORTING ATPASE 
(PROTON PUMP) SIGNATURE


PR00120C 9.90 5.800e-19 802-818 


415 


PR00121 


SODIUM/POTASSIUM- 
TRANSPORTING ATPASE 
SIGNATURE 


PR00121D 16.72 1.209e-28 455-476 
PR00121I 15.47 2.500e-26 1037-1061
PR00121B 7.83 6.786e-26 218-238
PR00121G 6.89 8.875e-26 941-961
PR00121H 12.14 9.100e-26 1003-1023
PR00121F 6.70 4.214e-25 874-895
PR00121C 9.40 7.652e-23 382-404
PR00121E 13.97 1.563e-22 592-610
PR00121A 6.71 7.429e-19 191-205


415 


BL00154 


E1-E2 ATPases phosphorylation site 
proteins. 


BL00154E 20.37 8.615e-38 680-720 
BL00154B 15.44 2.800e-31 420-456 
BL00154G 21.18 9.526e-30 825-858 
BL00154F 8.23 6.400e-28 799-822 
BL00154C 12.38 6.000e-23 458-476 
BL00154A 11.86 9.500e-16 276-293
BL00154D 12.57 3.769e-13 595-605 


415 


PR00119 


P-TYPE CATION-TRANSPORTING 
ATPASE SUPERFAMILY 
SIGNATURE 


PR00119E 8.48 6.250e-25 802-821
PR00119B 13.94 2.800e-20 462-476
PR00119A 17.34 3.000e-15 302-316
PR00119D 9.56 3.571e-13 696-706
PR00119C 11.01 6.143e-13 674-685
PR00119F 11.81 7.750e-13 826-838


415 


BL01228 


Hypothetical cof family proteins. 


BL01228D 17.44 6.250e-11 800-824


415 


BL01047 


Heavy-metal-associated domain 
proteins. 


BL01047B 19.73 6.063e-10 808-828 


418 


BL00219 


Anion exchangers family proteins. 


BL00219K 12.73 9.883e-24 677-718 
BL00219M 9.98 5.208e-23 762-807 
BL00219H 10.06 5.034e-22 474-521
BL00219N 10.66 7.545e-22 808-851 
BL00219B 14.47 6.104e-20 194-237 
BL00219I 6.16 9.818e-17 587-640 
BL00219G 12.86 9.697e-16 434-472 
BL00219A 17.13 1.000e-15 65-96 
BL00219F 10.52 8.024e-15 381-404
BL00219C 17.29 4.470e-14 239-277 
BL00219O 14.02 1.000e-13 853-892 
BL00219E 11.63 2.019e-10 341-380
BL00219L 18.71 3.560e-10 719-757 


418 


PR00165 


ANION EXCHANGER SIGNATURE 


PR00165B 15.26 1.549e-13 376-396 
PR00165I 10.02 2.521e-13 675-694
PR00165E 8.63 8.859e-11 463-482
PR00165F 10.39 7.674e-10 495-513
PR00165G 11.41 8.180e-09 588-607


421 


DM00099 


4 kw A55R REDUCTASE 
TERMINAL DIHYDROPTERIDINE.


DM00099B 14.73 2.125e-09 455-464


421 


PR00501 


KELCH REPEAT SIGNATURE 


PR00501B 18.88 8.342e-09 453-467 


421 


BL00292 


Cyclins proteins. 


BL00292B 20.31 1.000e-08 432-462


422 


BL00599 


Aminotransferases class-II pyridoxal- 
phosphate attachment sit 


BL00599B 18.93 7.894e-12 394-422 


422 


PR00320 


G-PROTEIN BETA WD-40 REPEAT 
SIGNATURE 


PR00320B 12.19 5.500e-09 85-99 
PR00320C 13.01 6.400e-09 186-200 
PR00320A 16.74 6.927e-09 85-99 
PR00320A 16.74 8.024e-09 186-200 


423 


DM00215 


PROLINE-RICH PROTEIN 3. 


DM00215 19.43 8.780e-09 862-894 


423 


PF00761 


Polyomavirus coat protein. 


PF00761A 12.61 8.925e-09 461-485


427 


PR00902 


VP6 BLUE-TONGUE VIRUS INNER 
CAPSID PROTEIN SIGNATURE


PR00902J 18.54 6.400e-09 271-292 


428 


PR00902 


VP6 BLUE-TONGUE VIRUS INNER 
CAPSID PROTEIN SIGNATURE 


PR00902J 18.54 6.400e-09 271-292 


430 


BL00107 


Protein kinases ATP-binding region 
proteins. 


BL00107A 18.39 4.273e-15 118-148 


430 


PR00109 


TYROSINE KINASE CATALYTIC 
DOMAIN SIGNATURE 


PR00109B 12.27 9.426e-13 118-136


430 


BL00240 


Receptor tyrosine kinase class III 
proteins. 


BL00240E 11.56 6.743e-09 104-141 


432 


BL00518 


Zinc finger, C3HC4 type (RING 
finger), proteins. 


BL00518 12.23 6.333e-09 32-40


435 


PR00625 


DNAJ PROTEIN FAMILY 
SIGNATURE 


PR00625D 11.93 9.077e-09 59-69 


438 


DM00215 


PROLINE-RICH PROTEIN 3. 


DM00215 19.43 6.186e-09 460-492


448 


BL00031 


Nuclear hormones receptors DNA- 
binding region proteins. 


BL00031A 19.55 5.320e-30 11-43 
BL00031B 22.25 6.604e-16 27-58 


448 


PR00350 


VITAMIN D RECEPTOR 
SIGNATURE 


PR00350A 10.48 1.692e-16 11-27 
PR00350F 8.61 6.400e-11 290-313
PR00350B 9.35 7.581e-11 28-47
PR00350E 11.55 9.693e-11 242-261


448 


PR00047 


C4-TYPE STEROID RECEPTOR 
ZINC FINGER SIGNATURE 


PR00047A 15.70 2.200e-16 11-27
PR00047B 7.63 3.813e-16 27-42
PR00047C 5.40 5.000e-10 42-50
PR00047D 13.53 6.850e-10 50-58 


448 


PR00546 


THYROID HORMONE RECEPTOR 
SIGNATURE 


PR00546H 16.85 6.523e-09 169-188 


448 


PR00398 


STEROID HORMONE RECEPTOR 
SIGNATURE 


PR00398A 14.44 7.750e-14 11-28 
PR00398C 13.47 4.857e-09 35-53 
PR00398F 13.87 7.943e-09 150-169


449 


PR00205 


CADHERIN SIGNATURE 


PR00205B 11.39 2.473e-10 217-234
PR00205B 11.39 8.691e-10 321-338


449 


BL00232 


Cadherins extracellular repeat proteins
domain proteins.


BL00232B 32.79 5.279e-20 219-266
BL00232C 10.65 6.268e-12 217-234
BL00232C 10.65 9.308e-10 321-338


449 


PR00291 


SOYBEAN TRYPSIN INHIBITOR 
(KUNITZ-TYPE) SIGNATURE 


PR00291A 19.85 9.366e-09 225-254 


449 


PR00649 


GPR6 ORPHAN RECEPTOR 
SIGNATURE 


PR00649B 8.21 1.000e-08 252-269


452 


PD00306 


PROTEIN GLYCOPROTEIN 
PRECURSOR RE. 


PD00306B 5.57 9.000e-09 52-62 


457 


BL00290 


Immunoglobulins and major 
histocompatibility complex proteins. 


BL00290B 13.17 7.750e-19 52-69


458 


PR00245 


OLFACTORY RECEPTOR 
SIGNATURE 


PR00245A 18.03 4.966e-13 59-80 
PR00245B 10.38 8.875e-13 177-191 


458 


BL00237 


G-protein coupled receptors proteins. 


BL00237A 27.68 5 J00e-12 90-129 


458 


PR00237 


RHODOPSIN-LIKE GPCR 
SUPERFAMILY SIGNATURE


PR00237B 13.50 2.688e-10 59-80 
PR00237C 15.69 7.171e-10 104-126 
PR00237A 11.48 2.161e-09 26-50


464 


BL00427 


Disintegrins proteins. 


BL00427 13.93 7.592e-26 379-433 


464 


PR00138 


MATRIXIN SIGNATURE 


PR00138D 16.56 5.101e-11 278-303


464 


BL00142 


Neutral zinc metallopeptidases, zinc-
binding region proteins.


BL00142 8.38 7.545e-11 278-288


464 


PR00289 


DISINTEGRIN SIGNATURE 


PR00289A 13.62 2.500e-14 393-412 
PR00289B 11.79 4.226e-10 422-434


464 


PR00480 


ASTACIN FAMILY SIGNATURE 


PR00480B 15.41 8.909e-10 273-291


464 


PR00907 


THROMBOMODULIN SIGNATURE 


PR00907E 11.70 3.647e-09 591-613


464 


BL00546 


Matrixins cysteine switch. 


BL00546C 16.41 4.255e-09 272-303 


464 


BL00024 


Hemopexin domain proteins. 


BL00024D 17.28 5.596e-09 272-303


466 


DM01206 


CORONAVIRUS NUCLEOCAPSID 
PROTEIN. 


DM01206B 10.69 1.000e-08 9-28 


470 


PR00211 


GLUTELIN SIGNATURE 


PR00211B 0.86 5.673e-10 522-542


470 


PR00910 


LUTEOVIRUS ORF6 PROTEIN 
SIGNATURE 


PR00910A 2.51 8.607e-09 591-603 


470 


DM00215 


PROLINE-RICH PROTEIN 3. 


DM00215 19.43 4.051e-09 522-554 
DM00215 19.43 6.644e-09 512-544
DM00215 19.43 9.085e-09 531-563


474 


PR00220 


SYNAPTOPHYSIN/SYNAPTOPORIN 
FAMILY SIGNATURE 


PR00220D 8.32 7.585e-26 131-154 
PR00220C 11.05 4.477e-25 99-123 
PR00220A 10.93 8.244e-24 36-58
PR00220E 3.46 6.932e-23 197-215 


474 


BL00604 


Synaptophysin / synaptoporin proteins. 


BL00604E 8.32 1.444e-23 182-223 
BL00604B 9.95 1.329e-19 86-115
BL00604C 14.66 5.639e-12 116-147
BL00604D 12.28 5.410e-11 148-182


476 


PR00785 


NUCLEAR TRANSLOCATOR 
SIGNATURE 


PR00785H 15.80 7.692e-09 151-167 


477 


PR00245 


OLFACTORY RECEPTOR 
SIGNATURE 


PR00245A 18.03 7.300e-19 62-83 
PR00245C 7.84 8.579e-19 241-256 
PR00245D 10.47 4.000e-15 277-288 
PR00245B 10.38 4.405e-12 180-194 
PR00245E 12.40 1.509e-10 294-308


477 


BL00237 


G-protein coupled receptors proteins. 


BL00237A 27.68 6.143e-13 93-132 
BL00237D 11.23 5.091e-09 285-301


478 


BL00297 


Heat shock hsp70 proteins family 
proteins. 


BL00297D 11.95 8.835e-09 86-125 


481 


BL00219 


Anion exchangers family proteins. 


BL00219E 11.63 4.838e-24 376-415 
BL00219K 12.73 9.883e-24 715-756 



WO 03/025148 



PCT/US02/29964 



189 
Table 3 



SEQ ID

NO: 


Database 
entry ID 


Description 


Results* 








BL00219M 9.98 5.208e-23 800-845 
BL00219H 10.06 5.034e-22 509-556 
BL00219N 10.66 7.545e-22 846-889 
BL00219B 14.47 6.104e-20 218-261 
BL00219I 6.16 9.818e-17 625-678
BL00219G 12.86 9.697e-16 469-507
BL00219F 10.52 8.024e-15 416-439 
BL00219C 17.29 4.470e-14 263-301 
BL00219O 14.02 1.000e-13 891-930 
BL00219L 18.71 9.422e-10 757-795 


481 


PR00165 


ANION EXCHANGER SIGNATURE 


PR00165A 9.84 8.000e-18 386-408 
PR00165B 15.26 1.549e-13 411-431
PR00165I 10.02 2.521e-13 713-732
PR00165E 8.63 8.859e-ll 498-517 
PR00165F 10.39 7.674e-10 530-548 
PR00165G 11.41 8.180e-09 626-645


486 


PR00237 


RHODOPSIN-LIKE GPCR
SUPERFAMILY SIGNATURE 


PR00237G 19.63 2.552e-13 260-286 
PR00237B 13.50 3.045e-13 50-71 
PR00237F 13.57 1.000e-10 218-242 
PR00237A 11.48 9.333e-10 17-41 
PR00237C 15.69 2.800e-09 95-117


486 


BL00237 


G-protein coupled receptors proteins. 


BL00237A 27.68 3.032e-15 81-120 
BL00237C 13.19 2.324e-10 213-239 
BL00237D 11.23 2.607e-10 270-286
BL00237B 5.28 7.136e-09 185-196 


490 


BL00215 


Mitochondrial energy transfer proteins. 


BL00215A 15.82 7.618e-14 67-91 


491 


PR00245 


OLFACTORY RECEPTOR 
SIGNATURE 


PR00245A 18.03 8.364e-14 59-80 
PR00245C 7.84 5.500e-12 237-252 
PR00245B 10.38 4.600e-11 177-191
PR00245E 12.40 9.830e-10 290-304 


491 


PR00237 


RHODOPSIN-LIKE GPCR
SUPERFAMILY SIGNATURE 


PR00237G 19.63 3.605e-10 271-297 
PR00237C 15.69 6.175e-09 104-126 


491 


BL00237 


G-protein coupled receptors proteins. 


BL00237A 27.68 5.371e-13 90-129 
BL00237D 11.23 9.455e-09 281-297


493 


PR00019 


LEUCINE-RICH REPEAT 
SIGNATURE 


PR00019B 11.36 4.150e-10 117-130
PR00019B 11.36 9.100e-10 141-154
PR00019A 11.19 8.000e-09 120-133 


493 


PR00500 


POLYCYSTIC KIDNEY DISEASE 
PROTEIN SIGNATURE 


PR00500B 7.74 9.337e-09 225-245 


495 


BL00379 


CDP-alcohol phosphatidyltransferases
proteins. 


BL00379 24.64 8.855e-16 104-140 


500 


BL00790 


Receptor tyrosine kinase class V 
proteins. 


BL00790I 20.01 9.550e-10 107-137


501 


BL00031 


Nuclear hormones receptors DNA- 
binding region proteins. 


BL00031B 22.25 6.538e-34 277-308 


501 


PR00047 


C4-TYPE STEROID RECEPTOR 
ZINC FINGER SIGNATURE 


PR00047C 5.40 3.250e-14 292-300 
PR00047D 13.53 3.250e-12 300-308 


501 


PR00398 


STEROID HORMONE RECEPTOR 
SIGNATURE 


PR00398C 13.47 5.299e-14 285-303 
PR00398G 15.17 7.081e-09 388-408


504 


PR00500 


POLYCYSTIC KIDNEY DISEASE 
PROTEIN SIGNATURE 


PR00500A 5.70 8.768e-10 55-73


504 


PD02382 


RECEPTOR CHAIN PRECURSOR 
TRANSME. 


PD02382B 4.60 3.100e-09 263-269 


504 


BL00790 


Receptor tyrosine kinase class V 
proteins. 


BL00790I 20.01 7.643e-09 535-565


505


PR00245 


OLFACTORY RECEPTOR 
SIGNATURE 


PR00245A 18.03 6.870e-24 101-122 
PR00245C 7.84 2.421e-19 280-295
PR00245E 12.40 8.714e-16 333-347
PR00245D 10.47 6.786e-13 316-327 
PR00245B 10.38 6.906e-13 219-233 


505 


BL00237 


G-protein coupled receptors proteins. 


BL00237A 27.68 8.839e-15 132-171 
BL00237D 11.23 2.364e-09 324-340 


505 


PR00237 


RHODOPSIN-LIKE GPCR 
SUPERFAMILY SIGNATURE 


PR00237B 13.50 1.750e-09 101-122 
PR00237C 15.69 4.600e-09 146-168 
PR00237A 11.48 5.065e-09 68-92 
PR00237G 19.63 5.605e-09 314-340


505 


PR00023 


ZONA PELLUCIDA SPERM- 
BINDING PROTEIN SIGNATURE 


PR00023E 22.27 9.813e-09 170-187 


507 


PR00722 


CHYMOTRYPSIN SERINE 
PROTEASE FAMILY (S1)
SIGNATURE 


PR00722A 12.27 4.960e-15 244-259 
PR00722C 10.87 2.929e-14 509-521 


507 


BL00134 


Serine proteases, trypsin family, 
histidine proteins. 


BL00134B 15.99 3.571e-19 510-533 
BL00134A 11.96 3.160e-17 243-259
BL00134C 13.45 3.250e-13 546-559 


507 


BL00495 


Apple domain proteins. 


BL00495N 11.04 4.729e-24 502-536
BL00495O 13.75 6.127e-15 537-565 
BL00495M 8.50 6.400e-12 429-463 


507 


BL01253 


Type I fibronectin domain proteins. 


BL01253H 13.15 8.364e-19 528-562
BL01253G 11.34 1.574e-17 509-522
BL01253F 14.35 6.850e-14 465-503
BL01253E 16.01 8.861e-14 427-463
BL01253D 4.84 6.400e-10 243-256 


507 


BL00021 


Kringle domain proteins. 


BL00021D 24.56 8.500e-28 518-559 
BL00021B 13.33 5.154e-15 243-260 
BL00021C 22.21 6.943e-09 438-459 


509 


PR00007 


COMPLEMENT C1Q DOMAIN 
SIGNATURE 


PR00007B 14.16 6.657e-15 246-265 
PR00007C 15.60 2.047e-14 294-315
PR00007A 19.33 8.412e-12 219-245 


509 


BL00415 


Synapsins proteins. 


BL00415N 4.29 7.307e-09 157-200 


509 


BL01113


C1q domain proteins.


BL01113B 18.26 3.647e-27 225-260
BL01113A 17.99 1.000e-13 162-188
BL01113C 13.18 2.532e-13 294-313
BL01113A 17.99 7.081e-13 153-179
BL01113A 17.99 8.297e-13 150-176
BL01113A 17.99 3.538e-12 159-185
BL01113A 17.99 5.385e-12 165-191
BL01113A 17.99 5.909e-11 168-194
BL01113A 17.99 8.773e-11 156-182
BL01113A 17.99 9.135e-09 147-173 


509 


BL00420 


Speract receptor repeat proteins domain 
proteins.


BL00420A 20.42 4.808e-12 150-178 
BL00420A 20.42 5.967e-10 147-175
BL00420A 20.42 7.231e-09 165-193 
BL00420A 20.42 9.169e-09 171-199 


513 


BL00237 


G-protein coupled receptors proteins. 


BL00237A 27.68 9.486e-13 92-131


513 


PR00245 


OLFACTORY RECEPTOR 
SIGNATURE 


PR00245A 18.03 6.714e-12 61-82 
PR00245C 7.84 8.000e-10 240-255 


513 


PR00237 


RHODOPSIN-LIKE GPCR 
SUPERFAMILY SIGNATURE 


PR00237A 11.48 5.355e-09 28-52 
PR00237C 15.69 9.550e-09 106-128 


516 


PR00237 


RHODOPSIN-LIKE GPCR 
SUPERFAMILY SIGNATURE 


PR00237G 19.63 2.543e-11 665-691
PR00237A 11.48 3.000e-10 419-443


516


PR00373 


GLYCOPROTEIN HORMONE 
RECEPTOR SIGNATURE 


PR00373D 11.16 2.403e-09 498-512


516 


BL00237 


G-protein coupled receptors proteins. 


BL00237A 27.68 6.600e-10 491-530 
BL00237D 11.23 4.545e-09 675-691


516 


PR00019 


LEUCINE-RICH REPEAT 
SIGNATURE 


PR00019A 11.19 7.300e-ll 210-223 
PR00019A 11.19 8.043e-10 280-293 
PR00019B 11.36 5.320e-09 207-220


516 


PR00910 


LUTEOVIRUS ORF6 PROTEIN 
SIGNATURE 


PR00910A 2.51 7.429e-09 395-407 


519 


BL00649 


G-protein coupled receptors family 2 
proteins. 


BL00649C 17.82 6.564e-13 578-603 


519 


PR00249 


SECRETIN-LIKE GPCR 
SUPERFAMILY SIGNATURE 


PR00249C 17.08 4.323e-10 580-603 


521 


PR00176 


SODIUM/NEUROTRANSMITTER 
SYMPORTER SIGNATURE 


PR00176C 10.84 2.667e-24 142-168 
PR00176A 16.82 5.500e-23 69-90 
PR00176B 7.31 9.308e-17 98-117


521 


BL00610 


Sodium: neurotransmitter symporter 
family proteins. 


BL00610A 17.73 1.000e-40 69-118
BL00610B 23.65 1.000e-40 133-182 
BL00610C 12.94 6.157e-14 226-277 


524 


PR00237


RHODOPSIN-LIKE GPCR 
SUPERFAMILY SIGNATURE 


PR00237B 13.50 7.750e-14 93-114
PR00237C 15.69 1.667e-12 140-162 
PR00237F 13.57 8.333e-12 278-302 
PR00237E 13.03 6.667e-11 229-252
PR00237D 8.94 7.750e-10 174-195 


524 


BL00419 


Photosystem I psaA and psaB proteins. 


BL00419L 20.03 7.850e-09 11-59 


524 


BL00237 


G-protein coupled receptors proteins. 


BL00237A 27.68 3.739e-20 126-165 
BL00237C 13.19 4.808e-13 273-299 
BL00237B 5.28 8.773e-09 237-248 


526 


PR00237 


RHODOPSIN-LIKE GPCR 
SUPERFAMILY SIGNATURE 


PR00237D 8.94 2.000e-09 171-192 


526 


BL00237 


G-protein coupled receptors proteins. 


BL00237A 27.68 3.020e-09 121-160 


526 


PR00641 


EBI1 ORPHAN RECEPTOR
SIGNATURE 


PR00641E 10.22 8.975e-09 119-136 


527 


BL00519 


Bacterial regulatory proteins, asnC 
family proteins. 


BL00519C 29.50 6.595e-09 110-154 


531 


BL00237 


G-protein coupled receptors proteins. 


BL00237A 27.68 8.258e-15 143-182 


531 


PR00237 


RHODOPSIN-LIKE GPCR 
SUPERFAMILY SIGNATURE 


PR00237A 11.48 7J75e-ll 81-105 
PR00237B 13.50 4.094e-10 113-134 
PR00237C 15.69 2.575e-09 157-179 


532 


BL00237 


G-protein coupled receptors proteins. 


BL00237A 27.68 2.029e-13 111-150 


532 


PR00245


OLFACTORY RECEPTOR 
SIGNATURE 


PR00245A 18.03 9.000e-23 80-101 
PR00245C 7.84 3.543e-14 259-274 
PR00245B 10.38 9.357e-14 198-212 
PR00245E 12.40 8.286e-12 312-326 


532 


PR00237 


RHODOPSIN-LIKE GPCR
SUPERFAMILY SIGNATURE


PR00237A 11.48 2.161e-09 47-71

DDArtinr 1 1< AQ A KAo Aft IOC 1 At 


533 


PR00245 


OLFACTORY RECEPTOR 
SIGNATURE 


PR00245A 18.03 1.000e-17 603-624 


534 


PR00249 


SECRETIN-LIKE GPCR 
SUPERFAMILY SIGNATURE 


PR00249C 17.08 9.129e-11 247-270
PR00249E 14.90 4.493e-10 332-357 


534 


BL00649 


G-protein coupled receptors family 2 
proteins. 


BL00649C 17.82 6.073e-13 245-270 
BL00649E 15.34 2.857e-12 332-361 
BL00649G 13.52 8.826e-11 505-530
BL00649B 20.68 8.548e-09 189-234 


538 


PR00245 


OLFACTORY RECEPTOR
SIGNATURE


PR00245C 7.84 6.049e-15 238-253
PR00245A 18.03 6.192e-15 59-80
PR00245E 12.40 4.643e-12 291-305
PR00245B 10.38 4.886e-10 177-191


538 


BL00237 


G-protein coupled receptors proteins. 


BL00237A 27.68 5.500e-12 90-129 
BL00237D 11.23 7.545e-09 282-298


538 


PR00237 


RHODOPSIN-LIKE GPCR
SUPERFAMILY SIGNATURE 


PR00237G 19.63 2.674e-09 272-298 
PR00237E 13.03 7.088e-09 199-222 
PR00237C 15.69 8.875e-09 104-126 


542 


BL00243 


Integrins beta chain cysteine-rich 
domain proteins. 


BL00243H 17.53 4.375e-10 411-436


542 


PR00011 


TYPE III EGF-LIKE SIGNATURE


PR00011D 14.03 3.508e-11 416-434
PR00011B 13.08 4.522e-10 416-434
PR00011A 14.06 2.479e-09 416-434


542 


PR00962 


LETHAL(2) GIANT LARVAE 
PROTEIN SIGNATURE 


PR00962F 12.39 6.855e-09 517-536 


543 


BL00518 


Zinc finger, C3HC4 type (RING 
finger), proteins. 


BL00518 12.23 4.857e-10 31-39


544 


BL00733 


Ribosomal protein S26e proteins. 


BL00733A 11.62 8.784e-25 1-43 
BL00733B 12.04 6.870e-20 44-76 


544 


BL00127 


Pancreatic ribonuclease family proteins. 


BL00127B 26.57 3.455e-09 134-178 


546 


PR00237 


RHODOPSIN-LIKE GPCR 
SUPERFAMILY SIGNATURE 


PR00237B 13.50 8.313e-10 64-85
PR00237D 8.94 7.000e-09 145-166 


547 


BL00790 


Receptor tyrosine kinase class V 
proteins. 


BL00790I 20.01 7.480e-11 1216-1246
BL00790I 20.01 6.963e-10 1115-1145
BL00790I 20.01 8.988e-10 1314-1344
BL00790H 13.42 9.514e-10 1266-1291


547 


DM00215 


PROLINE-RICH PROTEIN 3.


DM00215 19.43 1.305e-09 2034-2066


547 


PD02870 


RECEPTOR INTERLEUKIN-1
PRECURSOR. 


PD02870B 18.83 8.024e-12 1408-1440
PD02870D 15.74 9.900e-10 1408-1442
PD02870B 18.83 7.415e-09 339-371


547 


PR00014 


FIBRONECTIN TYPE III REPEAT 
SIGNATURE 


PR00014A 8.22 3.864e-09 1265-1274
PR00014D 12.04 7.750e-09 1122-1136


547 


DM00179 


w KINASE ALPHA ADHESION T- 
CELL. 


DM00179 13.97 8.043e-09 347-356 


547 


PD02327 


GLYCOPROTEIN ANTIGEN 
PRECURSOR IMMUNOGLO. 


PD02327B 19.84 9.591e-09 305-326 
PD02327B 19.84 9.591e-09 676-697


547 


BL00240 


Receptor tyrosine kinase class III 
proteins. 


BL00240B 24.70 7.907e-10 487-510
BL00240B 24.70 1.000e-08 305-328 


548 


PR00001 


COAGULATION FACTOR GLA 
DOMAIN SIGNATURE 


PR00001A 12.78 2.174e-13 23-36 
PR00001B 10.75 8.364e-13 37-50 
PR00001C 16.60 6J27e-09 51-65 


550 


PR00245 


OLFACTORY RECEPTOR 
SIGNATURE 


PR00245A 18.03 2.500e-22 59-80 
PR00245C 7.84 7.000e-18 238-253 
PR00245B 10.38 7.480e-15 177-191 
PR00245E 12.40 6.029e-13 291-305 


550 


BL00237 


G-protein coupled receptors proteins. 


BL00237A 27.68 6.182e-14 90-129 
BL00237D 11.23 7.750e-10 282-298


550 


PR00237 


RHODOPSIN-LIKE GPCR 
SUPERFAMILY SIGNATURE 


PR00237G 19.63 5.219e-12 272-298 
PR00237E 13.03 1.000e-10 199-222 
PR00237C 15.69 3.925e-09 104-126 


551 


PR00165 


ANION EXCHANGER SIGNATURE 


PR00165A 9.84 1.652e-16 453-475
PR00165B 15.26 7.835e-14 478-498 
PR00165I 10.02 5.378e-12 781-800 
PR00165D 7.84 8.159e-ll 534-553 
PR00165F 10.39 8.729e-ll 597-615 
PR00165H 8.01 1.321e-10 729-749 


551 


BL00219 


Anion exchangers family proteins. 


BL00219C 17.29 7.474e-25 338-376 
BL00219N 10.66 4.575e-24 914-957 
BL00219E 11.63 9.471e-24 443-482
BL00219K 12.73 2.098e-22 783-824 
BL00219B 14.47 8.571e-22 293-336 
BL00219M 9.98 7.222e-21 868-913 
BL00219H 10.06 9.693e-21 576-623 
BL00219A 17.13 4.176e-20 127-158 
BL00219I 6.16 3.106e-19 693-746 
BL00219L 18.71 3.889e-19 825-863
BL00219G 12.86 3.198e-17 536-574 
BL00219F 10.52 7.152e-16 483-506 
BL00219O 14.02 1.835e-11 959-998
BL00219D 15.15 3.148e-10 377-412



*Results include, in order: accession number subtype; raw score; p-value; position of signature in amino acid
sequence.
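
The field order stated in the footnote above can be illustrated with a short parsing sketch. This is not part of the original disclosure; the names SignatureHit and parse_hit are hypothetical, and the pattern assumes an accession of two letters and five digits with an optional one-letter subtype, as the Table 3 entries appear to follow.

```python
import re
from typing import NamedTuple, Optional


class SignatureHit(NamedTuple):
    """One Table 3 result entry (hypothetical container for illustration)."""
    accession: str   # e.g. "PR00245"
    subtype: str     # e.g. "A"; empty when the entry has no subtype letter
    raw_score: float
    p_value: float
    start: int       # first residue of the signature in the amino acid sequence
    end: int         # last residue of the signature


# Assumed layout: accession+subtype, raw score, p-value, start-end.
_HIT = re.compile(
    r"^(?P<acc>[A-Z]{2}\d{5})(?P<sub>[A-Z]?)\s+"
    r"(?P<score>\d+\.\d+)\s+"
    r"(?P<p>\d+\.\d+e-\d+)\s+"
    r"(?P<start>\d+)-(?P<end>\d+)$"
)


def parse_hit(line: str) -> Optional[SignatureHit]:
    """Parse one result entry; return None if the line does not fit the layout."""
    m = _HIT.match(line.strip())
    if m is None:
        return None
    return SignatureHit(
        m["acc"], m["sub"], float(m["score"]), float(m["p"]),
        int(m["start"]), int(m["end"]),
    )


hit = parse_hit("PR00245A 18.03 2.500e-22 59-80")
print(hit)
```

Entries that do not match the assumed layout (for example, a row damaged in reproduction) yield None rather than a guessed value, so they can be reviewed by hand.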
Table 4A



SEQ ID

NO:


Pfam Model


Description


E-value


Score


277


zf-C3HC4


Zinc finger, C3HC4 type (RING finger)


5 7p-10 
j.zciu 


'Xf. 7 


278


zf-C3HC4


Zinc finger, C3HC4 type (RING finger)


s 7p in 


1ft 7 

JU. / 


279


PA


PA domain


1 If 18 

i oe- 1 o 


7S 1 


701 

ZoZ 


transmembrane4 


Tetraspanin family


1 7*»-4» 

l . / e-**o 


Iftl 4 
101.*r 


7117 
Zo/ 


sushi 


Sushi domain (SCR repeat)


i.oeoo 


9H1 1 
ZU1 . 1 


OOA 
ZVU 


ART


NAD:arginine ADP-ribosyltransferase


A 7f,7 

o.De-zu/ 


7 An o 
/UU.o 


Zyz 


uPAR_Ly6


u-PAR/Ly-6 domain


n ni 
U.UI 


14 0 


293 


PMP22 Claudin 


PMP-22/EMP/MP20/Claudin family


9.4e-06 


32.5 


294 


MHC_II_alpha 


Class II histocompatibility antigen, alpha
domain 


4.1e-44 


loU.U 






Amidase 


Amidase 


A Aa 1 1 

4.oe- / 1 


Z4y.D 


zyt> 


Na sulph symp 


Sodium:sulfate symporter transmembrane region


1 la "71 


OCQ A 

Zj<$.U 


zyo 


ABC membrane 


ABC transporter transmembrane region. 


i .oe-ju 


ZUI.Z 


ioo 
zyy 


PMP22 Claudin


PMP-22/EMP/MP20/Claudin family




OO 1 

-zy. j 


JUO 


Acyltransferase


Acyltransferase


y.oe-uo 


1A Q 


1AO 


7tm 1


7 transmembrane receptor (rhodopsin family) 


a 1a in 
le-ju 


y f.o 




Neur_chan_LBD


Neurotransmitter-gated ion-channel ligand
binding domain


7 87 

z.ze-oJ 


*>on 4 


717 
J1Z 




Immunoglobulin domain 


4 7#» 7H 

*f . /e-zu 


fLQ 7 

oy. / 


Jlj 




Leucine Rich Repeat 


1 Q*» 77 


01 7 

yi.j 


114 


Plexin repeat


Plexin repeat


fl A7 
u.UZ 


7A 7 

zu.z 


71 ^ 


Plexin repeat


Plexin repeat 


n n7 

U.UZ 


7ft 7 
ZU.Z 


71ft 


7tm 1


7 transmembrane receptor (rhodopsin family) 


1 Oa 9^ 

i.ze-zD 


R7 ft 
oj.O 


70A 


7tm 1


7 transmembrane receptor (rhodopsin family) 


1 Qo 0< 

i .ve-y o 


in*; a 


771 

JZl • 


7tm 1


7 transmembrane receptor (rhodopsin family) 


7 7<i 10 


ft7 7 
Oj.Z 


777 
JZZ 


TPR


TPR domain


4 9a ia 


ftft 7 
00. / 


17A 
JZO 




Clq domain 


z. /e-.* I 


I 1 "7 A 

I I /.4 


71ft 


7tm 1


7 transmembrane receptor (rhodopsin family) 


4 1*. 1 ^ 


<ft 1 


777 


UbiA


UbiA prenyltransferase family


i.je-oz 


07 1 7 


17ft 

JjO 


7tm 1


7 transmembrane receptor (rhodopsin family) 




177 ft 
1ZZ.0 




7tm 1


7 transmembrane receptor (rhodopsin family) 


j.oe-jo 


1 *>o ft 
IZZ.o 




COesterase


Carboxylesterase


1 Op 1 lA 


4^0 ft 

*t uy.u 




7tm 2


7 transmembrane receptor (Secretin family)


7 7-71 
z. je-z i 


84 4 


747 


7tm 1 


7 transmembrane receptor (rhodopsin family)


7 R*> 7^ 


87 1 
oZ. 1 




7tm 1


7 transmembrane receptor (rhodopsin family)


1 "Xp 71 


1ft7 ft 
I uz.o 


145 


7tm 2


7 transmembrane receptor (Secretin family)


7 7*»-77 


7<tft ft 


346 


7tm 2 


7 transmembrane receptor (Secretin family) 


3.3e-73


256.6 


IS 1 




Immunoglobulin domain


o.oe-u/ 


07 1 
Z /.J 


7SS 


T PR 


I Aii/*irtA P ion O £*r\£*l t 
l^CUwlilC IVlt-il Ixm^pCol 


O. I e-Z5r 


luy.o 


357 


Reprolysin 


Reprolysin (M12B) family zinc metalloprotease 


3.7e-93 


322.9 


ICQ 


ig 


Immunoglobulin domain 


7 7a lift 

z./e-uo 


11 Q 


359 


ig 


Immunoglobulin domain 


2.7e-08


31.8


ioz 


ig 


Immunoglobulin domain 


4.le-Uo 


31.2 




Folate carrier 


Reduced folate carrier E 


3.5e-145


495.7


7ftR 




Cytochrome P450


a 4 P .<7 


7ft7 1 
ZUJ. 1 


370 


gla 


Vitamin K-dependent carboxylation/gamma-
carboxyglutamic (GLA) domain


6.1e-15 


63.1 


371 


actin 


Actin 


5.7e-27 


89.8 


375 


TruB_N


TruB family pseudouridylate synthase (N 
terminal domain) 


6.6e-69 


242.3 


376 


TruB_N


TruB family pseudouridylate synthase (N 
terminal domain) 


6.6e-69 


242.3 


377 


abhydrolase 


alpha/beta hydrolase fold 


0.015 


15.7 


378 


abhydrolase 


alpha/beta hydrolase fold 


1.1e-10


49.0 


382 


TTL 


Tubulin-tyrosine ligase family


4.1e-122 


419.1 




UQ_con


Ubiquitin-conjugating enzyme


0.0067 


-45.5 


388 


Amino oxidase 


Flavin containing amine oxidase


1.3e-17 


71.9 


389 


RUN 


RUN domain 


8e-51 


182.3 


390 


Rhomboid


Rhomboid family


4.7e-05 


30.2 


392 


Op_pliidin 


Orchid in/FT I familv 


1.2e-11


46.2 


393


DUF6


Integral membrane protein DUF6


0.037 


14.8 


395


Patched


Patched family


5.2e-105 


362.3 


396 




Zinc finger, C4 type (two domains)


1.4e-44 


152.5 


398


Na_H_Exchanger


Sodium/hydrogen exchanger family


O O e .1Q3 
7.7C 1 VJ 


354.7 


407 

HvZ 




F_nov H/MT*!*in 
r-UUA UUIUalll 


0.022 


21.4 


**vW 


DAD') 


far/ SupciTalTlliy 


1 4p 30 


115 0 


406 


xdicneu 


PatpViprl fomilv 
i alL/ICU IdJIUly 


5 Rp-17 


-4 9 


HI I 


7tm 1


7 transmembrane receptor (rhodopsin family)


5 4p-43 


138 7 


AM 


7tm 1


7 transmembrane receptor (rhodopsin family)


7 Rp 01 


707 1 

Z7Z. 1 


41 ^ 


PI P7 AXPacA 


pi C? ATPjicp 

c i -cz A i r ase 


1 1 p.1 16 

1.1 C~ 1 1 o 


3R7 0 


41ft 


nbUJ COuanSp 


I4i ill imrtrnArtfir ffimilii 

riv/Uj" transporter ramiiy 


1 7p.307 
l .zeouz 


1018 0 


421 


Kelch 


Kelch motif 


6.5e-40 


146.0 


4ZZ 


\xrr\4o 


WD domain, G-beta repeat 


/.je-i o 


66 1 
00. 1 


471 
4ZJ 


Beach 


tjeige/rJcACn aomain 


/.je-zj 


Q£ O 
50.7 




DZJr 


bZIP transcription factor 


fl AA74 


1 j.j 




pkinase 


Protein kinase domain 


1 &A 1#i 

l .oe-jo 


1 74 A 


AT) 
4JZ 


™f r , iT4r , 4 

ZI-v-orlL>t 


z.inc linger, v^^riv^** type ^kjjnvj iingerj 


0 4p_A6 


77 Q 
zz.y 




PXJTP77 /^InnHirt 

rlYLrZZ daUuin 


r Mr-zz/cMr/ivirzu/v^iauam iamiiy 


i . /e-oy 


1 44 7 


A7ft 




iviwiviN repeal 


1 4p ia 


17R 3 

I /.O.J 


447 


r/vrZ 


PA DO nmArftmilu 

rArz supenamijy 


7 Op 70 

z.ye-zy 


i in 7 


448 


hormone rec 


Li gand -binding domain of nuclear hormone 

T*pppntfYr 


1 p_41 


1 1Q O 


449 
tty 


VaUJICI III 


V^ailUCllil Ul/lIJaUl 


1 6p-37 

1 .UC*J / 


138 1 


451 


£ 1 'vAAv 




7 1p-06 


34 7 


452 


HLH 


Helix-loop-helix DNA-binding domain


2.6e-09 


44.4 


457 




Immunoglobulin domain


0 0098 

U.WU70 


13 9 

1 J.7 


458 

fJO 


7tm 1


7 transmembrane receptor (rhodopsin family)




83 6 


463 


TUDOR


Tudor domain


6 6p-1 3 


56 3 

_/v. J 


464 
•tut 


Reprolysin


Reprolysin (M12B) family zinc metalloprotease


3 1p-88 


306 6 


468 
too 


HEAT 


HEAT repeat


0 0013 


75 4 


469 
tU7 


DUF6


Integral membrane protein DUF6


1 4p-05 


37 0 


471 


OFNN 


OFNN f AFY-3 • domain 

lyClili l AliA J I UVlUlalll 


7 le-59 

/ . IC"J7 


709 0 

X. 1/7. VI 


474 


Synaptophysin


Synaptophysin / synaptoporin


4 7p 3» 


140 0 


476 




MYWn finopr 
J vi i v*u iijigci 


4 4p-05 
*t ,*te-vj 


70 5 

cy.j 


477 


7tm 1


7 transmembrane receptor (rhodopsin family)


7 4p-33 


10R 1 

1 I/O. 1 


481 


HC03 cotransp 


HC03- transporter family


0 


1065.8 


487 


ank 


Ank repeat


1p 19 

i e- 1 y i 


70 0 


485 


LRRCT


Leucine rich repeat C-terminal domain


1 1p.08 

1 . 1 C-l/O 


47 3 


486 


7tm 1 


7 transmembrane receptor (rhodopsin family)


5.3e-42 


135.6 


490 


mito_carr


Mitochondrial carrier protein 


5.6e-24 


93.1 


491 


7tm 1 


7 transmembrane receptor (rhodopsin family) 


3.8e-28 


91.6


493 


LRR 


Leucine Rich Repeat 


1.7e-15 


64.9 


499 


Rap GAP 


Rap/ran-GAP 


2e-20 


81.3 


500 


fn3


Fibronectin type III domain 


l.le-12 


55.6 


501 


hormone_rec


Ligand-binding domain of nuclear hormone 
receptor 


2e-46 


154.4 


503 


RhoGEF 


RhoGEF domain 


2.8e-33 


124.0 


504 


fn3


Fibronectin type III domain 


1.5e-09 


45.1 


505


7tm 1


7 transmembrane receptor (rhodopsin family)


3.1e-45 


1 A C Q 
143.(5 


507


trypsin 


Trypsin 


7e-o7 


276.1 


508


PKD


PKD domain


1.2e-09


43.3


509


Clq 


Clq domain 


2.7e-31 


117.4


513


7tm 1


7 transmembrane receptor (rhodopsin family) 


3.3e-12 


40.9


516


LRR


Leucine Rich Repeat 


7.3e-31 


116.0


519


7tm_2


7 transmembrane receptor (Secretin family) 


2.3e-21 


84.4 


521


SNF


Sodium:neurotransmitter symporter family


1.7e-124


427.0


523


SPRY


SPRY domain


A O — 1A 

9.oe-20 


79.0


524


7tm_1


7 transmembrane receptor (rhodopsin family) 


C "1*. CA 

3.3e-59 


1 OA C. 

189.6 


527


Patched


Patched family


A AAA11 


A 1 A A 


531


7tm_1


7 transmembrane receptor (rhodopsin family) 


3.le-lo 


/XA 1 


532


7tm_1


7 transmembrane receptor (rhodopsin family) 


1.7e-37 


121.3


533 


7tm 1 


7 transmembrane receptor (rhodopsin family) 


6.7e-10 


33.6 


534 


7tm 2 


7 transmembrane receptor (Secretin family) 


3.3e-73 


256.6 


535 


Rhomboid 


Rhomboid family 


8.5e-18 


72.6 


536 


Rhomboid 


Rhomboid family 


8.5e-18 


72.6 


538


7tm_1


7 transmembrane receptor (rhodopsin family) 


4.oe-3o 


123.1 


542 


SEA 


SEA domain 


5.1e-10 


46.7 


543 


SPRY 


SPRY domain 


2.6e-17 


70.9 


544 


Ribosomal S26e 


Ribosomal protein S26e 


2.1e-20


81.2 


547 


fn3


Fibronectin type III domain 


4.1e-102 


352.6 


548 


gla 


Vitamin K-dependent carboxylation/gamma-
carboxyglutamic (GLA) domain


3e-15 


64.1 


550 


7tm 1 


7 transmembrane receptor (rhodopsin family) 


4e-43 


139.1 


551 


HCO3_cotransp


HCO3- transporter family


0 


1704.8 


552 


DUF6 


Integral membrane protein DUF6 


0.069 


10.4 
Table 4B



SEQ 
ID 


Model 


Description 


E-value 


Score 


Repeats 


Position 


277 


zf-C3HC4


Zinc finger, C3HC4 type (RING 
finger) 


1.6e-07


38.5


1 


222-263 


277 


PA 


PA domain 


1.4e-06 


35.3 




58-144 


277 


PHD 


PHD-finger 


0.019 


5.9 


1


221-266 


278 


zf-C3HC4 


Zinc finger, C3HC4 type (RING 
finger)


1.6e-07 


38.5 


1 


198-239 


278 


PA 


PA domain 


0.004 


21.3 


1


28-120 


278 


PHD 


PHD-finger 


0.019 


5.9 




197-242 


279 


PA 


PA domain 


1.4e-18 


75.2 




58-162 


281 


Cornichon 


Cornichon protein 


4.4e-37 


136.6 




2-113 


281 


PsbT 


Photosystem II reaction centre T 
protein 


3.8 


6.4 




1-24 


282 


transmembrane 
4 


Tetraspanin family 


1.6e-24 


94.9 




10-166 


286 


sugar_tr 


Sugar (and other) transporter 


3.9 


-186.5 


1 


19-494 


286 


Na_sulph_sym 
P 


Sodium: sulfate symporter 
transmembrane 


9 


-362.5 


1 


78-453 


287 


sushi 


Sushi domain (SCR repeat) 


1.8e-56 


201.1 


4 


35-94:99- 
157:162- 
223:228- 
283 


290 


ART 


NAD:arginine ADP- 
ribosyltransferase 


1.8e-207 


702.6 


1 


1-326 


291 


PAP2 


PAP2 superfamily 


1.3 


-21.2 


1 


88-175 


292 


UPAR LY6 


u-PAR/Ly-6 domain 


0.0034 


12.8 




23-108 


292 


Keratin B2 


Keratin, high sulfur B2 protein 


0.48 


-63.3 


-j 


7-124 


293 


PMP22_Claudi 
n 


PMP-22/EMP/MP20/Claudin 
family 


9.4e-06 


32.5 




7-169 


294 


MHC_II_alpha 


Class II histocompatibility 
antigen, alp 


4.1e-44 


160.0 


1 


29-109 


294


ig


Immunoglobulin domain


0.016 


21.8 


1 


125-172 


295 


Amidase 


Amidase 


2.1e-65 


230.7 


1 


69-513 


296 


Na_sulph_sym
P


Sodium:sulfate symporter
transmembran


4.1e-71 


249.7 


1 


3-579 


296 


Na_H_antiporte 
r 


Na+/H+ antiporter family 


3.3 


-108.5 


1 


241-572 


296 


Peptidase C20 


Type IV leader peptidase family 


6.8 


-187.4 


1 


1-307 


296 


PHO4


Phosphate transporter family 


9 


-206.1 


1 


129-510 


298 


ABC_membran
e


ABC transporter transmembrane 
region 


1.7e-56 


201.1 


1 


188-459 


298 


ABC tran 


ABC transporter 


1.2e-53 


191.7 


1


469-653 


298 


APS kinase 


Adenylylsulfate kinase 


2.6 


-117.0 




468-587 


298 


DUF258 


Protein of unknown function, 
DUF258 


3.6 


-79.4 


1


446-596 


299


PMP22_Claudi
n


PMP-22/EMP/MP20/Claudin
family


0.048 


-29.1 




4-168


300 


Mtc 


Tricarboxylate carrier 


1.2e-67 


238.1 




1-236 


301 


Mab-21 


Mab-21 protein 


2.3 


-192.1 




189-524 


304 


Cornichon 


Cornichon protein 


3.4e-19 


77.2 




2-98


304 


PsbT 


Photosystem II reaction centre T 
protein 


3.8 


6.4 




1-24 


305 


PMP22_Claudi 
n 


PMP-22/EMP/MP20/Claudin 
family 


1.6 


-55.5 




1-192 


306 


Acyltransferase 


Acyltransferase 


4.9e-05 


30.2 




70-229 


308 


sugar_tr


Sugar (and other) transporter 


0.33 


-155.6 


1 


9-490 


308 


PUCC 


PUCC protein 


0.6 


-253.1 


1 


93-486 


308 


Nucleoside_tra 
n 


Nucleoside transporter 


2.1 


-151.4 


1 


143-456 


308 


oxidored_q1


NADH- 

Ubiquinone/plastoquinone 


7 


-168.7 


1 


151-478 


309 


7tm_1


7 transmembrane receptor 
(rhodopsin family) 


7.1e-05 


-4.8 


1 


41-235 


311 


Neur_chan_LB
D


Neurotransmitter-gated ion-
channel lig


1.4e-85 


297.7 


1 


30-236 


311 


Neur_chan_me
mb


Neurotransmitter-gated ion- 
channel tra 


6.5e-38 


139.4 


1 


243-446 


312 


ig 


Immunoglobulin domain 


2.1e-17 


71.3 


3 


37- 

106:138- 
208:245- 
300 


313 


LRR 


Leucine Rich Repeat 


1.3e-23 


91.9 


7 


66-89:90- 

113:114- 

137:138- 

161:163- 

186:187- 

210:211- 

233 


313 


ig 


Immunoglobulin domain 


2.7e-07 


37.7 


1 


314-372 


313 


fn3 


Fibronectin type III domain


2.4e-06 


34.5 




422-502 


313 


LRRCT 


Leucine rich repeat C-terminal 
domain 


5.6e-05 


30.0 


! 


252-297 


313 


LRRNT 


Leucine rich repeat N-terminal
domain 


3.7 


8.7 


1 


33-64 


313 


APS kinase 


Adenylylsulfate kinase 


5.6 


-120.4 




541-646 


314 


PSI 


Plexin repeat 


0.02 


20.2 


1


303-348 


315 


PSI 


Plexin repeat 


0.02 


20.2 




303-348 


316 


7tm_1


7 transmembrane receptor 
(rhodopsin family) 


4.7e-19 


76.7 


■f 


3-245 


316 


DUF40 


Domain of unknown function 
DUF40 


3.1 


-127.1 




2-206 


317 


Filamin 


Filamin/ABP280 repeat 


5.5 


-34.0 




100-192 


318 


Polysacc_synt 


Polysaccharide biosynthesis 
protein 


7 \ 


-87.4 


-j — 


107-368 


320 


7tm_1


7 transmembrane receptor 
(rhodopsin family) 


1.2e-90 


314.5 


1


54-335 


321 


7tm 1 


7 transmembrane receptor 
(rhodopsin family) 


2.6e-08 


41.0 


1 


32-309 


321 


7tm 5 


7TM chemoreceptor 


8.3 


-169.8 




14-317 


322 


TPR 


TPR Domain 


4.3e-16 


66.9 


3 


493- 

526:527- 
560:561- 
594 


322 


PMT 


Dolichyl-phosphate-mannose- 
protein mannosylt 


3.2 


-54.0 


1 


6-245 


326 


C1q


C1q domain


7.3e-32 


119.3 


1 


117-241 


326 


Collagen 


Collagen triple helix repeat (20 
copies) 


3.8e-06 


33.8 


1 


50-109 


326 


Lysis_col 


Lysis protein 


9.3 


-10.9 


1 


1-36 


330


7tm_1


7 transmembrane receptor
(rhodopsin family)


0.027


-64.6


1


1-183


331 


PKD 


PKD domain 


1.7e-08 


41.7 


4 


407- 

495:502- 

591:596- 

685:690- 

782 


331 


REJ 


REJ domain 


0.99 


-314.6 


1 


327-806 


331 


fn3


Fibronectin type III domain 


3.7 


-2.3 




408-486 


331 


Arthrodefensi 
n 


Arthropod defensin 


4.6 


4.0 


-j 


879-907 


333 


UbiA 


UbiA prenyl transferase family 


3.2e-56 


200.2 


1 


86-351 


338 


7tm_1


7 transmembrane receptor 
(rhodopsin family) 


1.1e-34


128.7 


1 


40-289 


338 


EII-Sor


PTS system sorbose-specific iic 
component 


9.1 


-143.4 


1 


20-226 


339 


7tm_1


7 transmembrane receptor 
(rhodopsin family) 


1.1e-34


128.7 


1


40-289 


339 


EII-Sor


PTS system sorbose-specific iic
component 


9.1 


-143.4 


1


20-226 


340 


COesterase 


Carboxylesterase 


2.3e-133 


456.4 


1


19-624 


341 


7tm 2 


7 transmembrane receptor 


2.3e-21 


84.4 




637-897 


341 


GPS 


Latrophilin/CL-1-like GPS
domain 


2.7e-13 


57.6 


-j 


581-634 


341 


HRM 


Hormone receptor domain 


0.0085 


15.8 


1


298-351 


341 


Me-amine-
deh_L


Methylamine dehydrogenase, L 
chain 


4 


-30.1 


1 


190-321 


342 


7tm_1


7 transmembrane receptor 
(rhodopsin family) 


3.4e-06 


25.9 




41-225 


342 


DUF32 


Domain of unknown function 
DUF32 


1.9 


-145.9 


1


37-242 


342 


DUF40 


Domain of unknown function 
DUF40 


9.1 


-135.5 


1 


26-240 


344 


7tm_1


7 transmembrane receptor 
(rhodopsin family) 


2.2e-28 


107.8 




44-293 


344 


Abi 


CAAX amino terminal protease 
family 


5.4 


-25.4 




101-190 


345 


7tm 2 


7 transmembrane receptor 


3.3e-73 


256.6 




396-739 


345 


GPS 


Latrophilin/CL-1-like GPS
domain 


3.1e-15 


64.0 


1


345-394 


345 


metalthio 


Metallothionein 


1.7 


-4.1 




33-100 


345 


7tm 5 


7TM chemoreceptor 


1.7 


-157.4 


1


392-650 


345 


CbiM 


CbiM 


2.1 


-83.3 


1 


497-654 


345 


DUF26 


Domain of unknown function 
DUF26 


2.9 


-12.6 


1 


64-109 


345 


cytochrome_b_C


Cytochrome b(C-
terminal)/b6/petD


4 


-28.5 


1


369-471 


345 


TIL 


Trypsin Inhibitor like cysteine 
richd 


9.7 


-15.4 




23-74 


346 


7tm 2 


7 transmembrane receptor 


3.3e-73 


256.6 




300-643 


346 


GPS 


Latrophilin/CL-1-like GPS
domain


3.1e-15 


64.0 




249-298 


346 


7tm 5 


7TM chemoreceptor 


1.7 


-157.4 




296-554 


346 


CbiM 


CbiM 


2.1 


-83.3 




401-558 


346 


cytochrome b 
C 


Cytochrome b(C- 
terminal)/b6/petD 


4 


-28.5 




273-375 


351 


ig


Immunoglobulin domain 


0.00033 


27.4 


1 


72-150 


355 


LRR 


Leucine Rich Repeat 


4.6e-29 


110.0 


7 


49-72:73- 

96:97- 

120:121- 

144:146- 

169:170- 

193:194- 

217 


355 


fn3


Fibronectin type III domain 


2.7e-08 


41.0 


1 


387-470


355 


ig 


Immunoglobulin domain 


2.4e-07 


37.9 




278-336 


355 


LRRCT 


Leucine rich repeat C-terminal 
domain 


0.054 


17.5 


1 


218-262 


355 


LRRNT 


Leucine rich repeat N-terminal 
domain 


1 


12.9 




1M7 


356 


thiored 


Thioredoxin 


0.0088 


-10.1 




172-279 


357 


Reprolysin 


Reprolysin (M12B) family zinc 
metallo 


3.6e-93 


322.9 


-} 


211-409 


357 


Pep_M12B_pro 
pep 


Reprolysin family propeptide 


7.7e-43 


155.7 


1 


80-196 


357 


disintegrin 


Disintegrin 


2.2e-25 


97.8 




426-501 


357 


Adeno E3 CR 
2 


Adenovirus E3 region protein 
CR2 


5.1 


-2.5 


1 


698-738


357 


EB 


EB module 


9.3 


-12.3 


1 


633-682 


358 


ig 


Immunoglobulin domain 


6.7e-07 


36.4 


2 


115- 

168:208- 
265 


359 


ig 


Immunoglobulin domain 


6.7e-07 


36.4 


2 


109- 

162:202- 
259 


362


ig 


Immunoglobulin domain 


6.9e-07 


36.3 


2 


47- 

139:179- 
274 


365 


Folate carrier 


Reduced folate carrier 


3.8e-145 


495.6 




10-441 


365 


ion trans 


Ion transport protein 


8.3 


-13.4 


1 


85-337 


365 


Nucleoside_tra 
n 


Nucleoside transporter 


8.7 


-163.1 


1 


113-367 


365 


FecCD 


FecCD transport family 


9.4 


-220.8 


1 


274-457 


365 


sugar tr 


Sugar (and other) transporter 


9.7 


-198.0 




11-459 


368 


p450 


Cytochrome P450 


4.6e-19 


76.8 


| 


60-379 


370 


gla 


Vitamin K-dependent 
carboxylation/gamma-carb 


3.5e-15 


63.9 


1 


57-98 


371 


actin 


Actin 


1.6e-12 


55.0 




8-371 


372 


DUF140


Domain of unknown function 
DUF140 


5.9 


-162.8 


1 


1-204 


375 


TruB_N 


TruB family pseudouridylate
synthase


6.6e-69 


242.3 


1 


107-247 


375 


PUA 


PUA domain 


5e-18 


73.3 




339-414 


376 


TruB_N


TruB family pseudouridylate 
synthase 


6.6e-69 


242.3 




78-218 


376 


PUA 


PUA domain 


1.8e-25 


98.0 




266-341 


377


abhydrolase 


alpha/beta hydrolase fold 


0.015 


15.7 




80-270 


377 


Lipase 3 


Lipase (class 3) 


0.6 


-26.8 




68-184 


377


Thioesterase 


Thioesterase domain 


1.9 


-44.1 




53-270 


378 


abhydrolase 


alpha/beta hydrolase fold 


1.1e-10


49.0 




80-326 


378 


Lipase 3 


Lipase (class 3) 


0.98 


-29.1 




68-198 


378


Thioesterase 


Thioesterase domain 


i a 

1.0 


A 1 A 


— 


d j-zy / 


382 


TTL


Tubulin-tyrosine ligase family 


l.je-izu 


A 1 1 O 

hi 3.y 


J 


400- /OH 


383 


UQ_con


Ubiquitin-conjugating enzyme 


A 1A 

4.ze-iu 


A1 A 


J 


"iAQ^k 1 O 

1 Z 


384 


sugar tr 


Sugar (and other) transporter 


1.2 


1 "71 "7 
-1/1./ 


J 


M-4/1 


384


voltage CLC 


Voltage gated chloride channel 


9.2 


")A1 A 

-243.0 


J 


yz-3y3 


388


Amino_oxidase 


Flavin containing amine 
oxidoreductase 


1 .ye-oy 


1AA 1 


1 


Z3-4y/ 


389 


DENN 


DENN (AEX-3) domain 


2.1e-87 


303.8 


1


202-390 


389


RUN


RUN domain


8e-M 


loZ.3 




OA1 OA£> 


389 


uDENN 


uDENN domain 


1.2e-32 


121.9 


1


4-138 


389 


dDENN 


dDENN domain 


3.2e-31 


117.1 




C 1 1 COO 

512-555 j 


389 


PLAT 


PLAT/LH2 domain 


7.4e-17 


69.4 


1


957-1059 


390 


Rhomboid


Rhomboid family 


4.7e-05 


30.2 




59-214 


390 


UIM


Ubiquitin interaction motif 


2.1 


14.6 




268-285 


392 


Occludin


Occludin/ELL family 


1.1e-05


-92.9 


1


183-550 


392 


7tm_5


7TM chemoreceptor 


4 


-164.0 


! 


184-451 


393 


DUF6 


Integral membrane protein 
DUF6 


0.042 


15.4 




80-186 


393 


Nramp 


Natural resistance-associated 
macrophage pro 


5.3 


-290.4 


1 


123-381 


393 


EII-GUT 


PTS system enzyme II sorbitol- 
specific facto 


5.8 


-135.7 


1 


192-300 


395 


Patched 


Patched family 


3.2e-105 


363.0 




166-965 


395 


Srg 


C.elegans Srg family integral 
membrane prote 


2.7 


-213.3 


-j — 


214-464 


395 


UPF0132 


Uncharacterised protein family 
(UPF0132) 


4.8 


-39.8 


[i — 


402-494 


395 


Sec62 


Translocation protein Sec62 


5.6 


-132.6 


-j — 


311-502 


396 


zf-C4 


Zinc finger, C4 type (two 
domains) 


1.8e-42 


154.5 




100-174 


396 


hormone_rec


Ligand-binding domain of 
nuclear hormone 


7e-17 


69.5 


1 


281-441 


398 


Na_H_Exchang
er


Sodium/hydrogen exchanger 
family 


9.9e-103 


354.7 


1


62-478 


398 


ABC2_membra 
ne 


ABC-2 type transporter 


0.92 


-112.6 


1


254-479 


398 


GntP_permease 


GntP family permease 


4.9 


-374.7 


1


64-366 


398 


Transp_cyt_pur


Permease for cytosine/purines, 
uracil 


5 


-194.9 


1 


50-427 


398 


ABC-3 


ABC 3 transport family 


7.8 


-194.6 




260-469 


398 


TrkH 


Sodium transport protein 


7.9 


-214.7 




12-411 


398 


DUF6 


Integral membrane protein 
DUF6 


8 


-23.3 




338-462 


398 


ER_lumen_rece
pt


ER lumen protein retaining 
receptor 


8.7 


-158.2 


1 


274-435 


399 


DUF284 


Eukaryotic protein of unknown 
function, DUF2 


1.5e-114 


394.0 




68-309 


402 


F-box 


F-box domain 


0.0091 


22.6 




8-55 


404 


PAP2 


PAP2 superfamily 


1.4e-30 


115.0 




129-283


406 


Patched 


Patched family 


5.8e-17 


-4.9 




1-756 


406 


oxidored_q1


NADH- 

Ubiquinone/plastoquinone 
(complex I) 


0.55 


-146.0 




77-319 


406 


UPF0118 


Domain of unknown function
DUF20


9.3


-133.5


1


377-719


411 


7tm_1


7 transmembrane receptor 
(rhodopsin family) 


7.1e-38 


139.3 


1 


41-290 


411 


7tm 5 


7TM chemoreceptor 


6.7 


-168.1 


1 


16-258 


412 


7tm_1


7 transmembrane receptor 
(rhodopsin family) 


1.3e-85 


297.9 


1 


43-297 


412 


7tm 5 


7TM chemoreceptor 


1.8 


-157.8 


1 


51-305 


413 


PHD 


PHD-finger


0.21 


-3.5 


1 


150-199 


413 


zf-MIZ 


MIZ zinc finger 


3.9 


-18.2 


j 


150-200 


415 


E1-E2 ATPase 


E1-E2 ATPase 


1.7e-113


390.5 




223-454 


415 


Cation ATPase 
C 


Cation transporting ATPase, C- 
terminu 


1.7e-69 


244.3 


1 


921-1099 


415 


Cation ATPase 
N 


Cation transporter/ATPase, N- 
terminus 


4.2e-42 


153.3 


1 


121-204 


415 


Hydrolase 


haloacid dehalogenase-like 
hydrolase 


3.7e-15 


63.8 




458-825 


415 


7tm 5 


7TM chemoreceptor 


9.4 


-170.7 




170-438 


416 


MAPEG 


MAPEG family 


2.1 


-21.7 




98-183 


416 


Cation ATPase 
C 


Cation transporting ATPase, C- 
terminu 


5.6 


-47.5 


1 


81-221 


418 


HCO3_cotrans
p


HCO3- transporter family


0 


1024.4 


1


84-853 


418 


xan_ur_permea
se


Permease family 


0.9 


-176.0 




375-836 


421 


Kelch 


Kelch motif 


3.9e-49 


176.7 


5 


258- 

308:310- 

355:357- 

417:419- 

471:473- 

519 


421 


BTB 


BTB/POZ domain 


0.88 


-10.1 


1 


2-70 


422 


WD40 


WD domain, G-beta repeat 


1.6e-20 


81.6 


4 


16-56:62- 
98:162- 
199:313- 
349 


422 


aminotran 1 2 


Aminotransferase class I and II 


0.0091 


-46.1 




391-597 


422 


Cys Met Meta 
PP 


Cys/Met metabolism PLP- 
dependent enzy 


9.6 


-318.8 




371-600 


423 


ribonuc_red_s 
m 


Ribonucleotide reductase, small 
chain 


5.6 


-142.1 


"j 


989-1265 


424 


DUF87 


Domain of unknown function 
DUF87 


3.9 


-134.3 


1 


48-354 


427 


DUF6 


Integral membrane protein
DUF6 


3.8 


-17.8 


1 


143-271 


427 


Frizzled 


Frizzled/Smoothened family 
membrane regio 


7.2 


-246.3 


1 


79-280 


427 


oxidored_q1


NADH- 

Ubiquinone/plastoquinone 
(complex I) 


9 


-170.9 




70-270 


428 


DUF6 


Integral membrane protein 
DUF6 


3.8 


-17.8 




143-271 


428 


Frizzled 


Frizzled/Smoothened family 
membrane regio 


12 


-246.3 




79-280 


428 


oxidored_q1


NADH-Ubiquinone/plastoquinone
(complex I)


9


-170.9




70-270


430 


pkinase 


Protein kinase domain 


5.6e-33 


123.0 




9-273 


432 


zf-C3HC4 


Zinc finger, C3HC4 type (RING 
finger) 


0.0015 


24.7 




13-59 


432 


FYVE 


FYVE zinc finger 


9.5 


-26.0 




10-65 


434 


PMP22_Claudi 
n 


PMP-22/EMP/MP20/Claudin 
family 


1.7e-39 


144.7 


; 


89-266 


434 


Grp1_Fun34_Y
aaH


GPR1/FUN34/yaaH family


5.9 


-120.3 




71-240 


435 


DnaJ_CXXCX 
GXG 


DnaJ central domain (4 repeats) 


3.5 


-46.2 




37-92 


437 


AT hook 


AT hook motif 


3.1 


10.6 


1 


713-725 


438 


MORN


MORN repeat


1.4e-34 


128.3 


7 


15-37:39- 

60:61- 

81:107- 

129:130- 

152:288- 

310:311- 

333 


443 


PAP2 


PAP2 superfamily 


2.9e-29 


110.7 


1 


82-230 


448 


hormone_rec


Ligand-binding domain of 
nuclear hormone 


3.6e-39 


143.6 


1


148-329


448 


zf-C4 


Zinc finger, C4 type (two 
domains) 


3.3e-25 


97.2 


1 


9-66 


449 


cadherin 


Cadherin domain 


3.2e-37 


137.1 


4 


15- 

108:127- 
227:241- 
331:342- 
441 


449


SMP-30 


Senescence marker protein-30 
(SMP-30) 


9 


-180.9 


1 


223-467 


450 


spectrin 


Spectrin repeat 


0.86 


-8.7 


1 


97-203 


451 


zf-CXXC 


CXXC zinc finger 


2.1e-06 


34.7 


1 


131-172 


452 


HLH 


Helix-loop-helix DNA-binding
domain 


4.4e-09 


43.6 


1 


106-165 


453 


TP2 


Nuclear transition protein 2 


8.8


-60.2 


1


200-335 


458


7tm_1


7 transmembrane receptor 
(rhodopsin family) 


2.1e-05 


7.3 


1 


41-233 


463


TUDOR 


Tudor domain 


6.6e-13 


56.3 


1 


13-134 


464 


Reprolysin 


Reprolysin (M12B) family zinc 
metallo


3e-88 


306.6 


1 


146-345 


464


Pep_M12B_pro 
pep 


Reprolysin family propeptide 


1.3e-31


118.4 


1 


16-134 


464


disintegrin 


Disintegrin 


2.5e-23 


90.9


1 


362-437 


464 


EGF 


EGF-like domain 


0.65 


16.5 


1 


589-616 


464


metalthio 


Metallothionein 


o. / 


-12.3 


1 


302-428 


466 


SAC3 GANP 


SAC3/GANP family 


8.8e-77 


268.5 


1 


159-358 


468 


HEAT 


HEAT repeat 


0.0012 


25.5 


1 


546-584 


469 


DUF6 


Integral membrane protein
DUF6 


0.00028 


27.7 


2 


50- 

179:197- 
327 


469 


PhaG MnhG 
YufB 


Na+/H+ antiporter subunit 


2 


-50.3 


1 


66-168 


469 


DUF7 


Integral membrane protein 
DUF7 


3.9 


-34.6 


1 


227-318 
469


Competence 


Competence protein 


7.5 


-104.9 




93-330 


471 


DENN 


DENN (AEX-3) domain 


4.9e-87 


302.6 


1


57-241 


471 


dDENN 


dDENN domain 


1.4e-25


98.4 




286-353 


471 


uDENN 


uDENN domain 


0.0068 


-0.5 


1


1-50 


474 


Synaptophysin 


Synaptophysin / synaptoporin 


4.2e-38 


140.0 




25-241 


476 


zf-MYND 


MYND finger 


3e-05 


30.9 




296-335 


476 


SET 


SET domain 


2.3 


-50.9 


■ 


450-577 


476 


Antifreeze 


Antifreeze-like domain 


8.4 


-10.3 




246-295 


477 


7tm_1


7 transmembrane receptor 
(rhodopsin family) 


2.4e-30 


114.2 




44-293 


481 


HCO3_cotrans
p


HCO3- transporter family


0 


1072.8 




108-891 


481 


xan_ur_permea
se 


Permease family 


0.64 


-172.1 


1 


410-874 


482 


ank 


Ankyrin repeat 


9.3e-20 


79.1 


4 


172- 

207:219- 
251:266- 
299:345- 
377 


485 


LRRCT 


Leucine rich repeat C-terminal
domain 


9.7e-09 


42.5 


1 


9-58 


485 


GPS 


Latrophilin/CL-1-like GPS
domain 


0.0012 


25.4 


1 


519-571 


485 


7tm_2 


7 transmembrane receptor 
(Secretin family) 


0.0055 


-90.7 


1 


578-784 


485


ig 


Immunoglobulin domain 


0.0078


22.8 


1


79-148 


485 


HRM 


Hormone receptor domain 


0.069 


6.8 


1 


168-241 


486 


7tm 1 


7 transmembrane receptor 


2.9e-38 


140.6 


1 


32-278 


486 


7tm 5 


7TM chemoreceptor 


0.23 


-141.7 


1 


55-268 


486 


V1R 


Vomeronasal organ pheromone 
receptor fami 


0.4 


-145.6 


1 


42-291 


486 


oxidored_ql 


NADH- 

Ubiquinone/plastoquinone 
(complex I) 


4.1 


-164.0 


1 


20-268 


486 


UPF0032 


MttB family UPF0032 


7.3 


-94.8 


1 


54-248 


490 


mito_carr


Mitochondrial carrier protein 


6e-24 


93.0 


2 


61- 

152:155- 
232 


491 


7tm_1


7 transmembrane receptor 
(rhodopsin family) 


5.3e-26 


99.8 


1 


41-289 


493 


LRR 


Leucine Rich Repeat 


1.2e-15 


65.5 


5 


95- 

118:119- 
142:143- 
166:167- 
190:191- 
214 


493 


LRRNT 


Leucine rich repeat N-terminal
domain 


3e-08 


40.9 


1 


64-93 


493 


LRRCT 


Leucine rich repeat C-terminal 
domain 


7.8e-07 


36.1 


1 


224-277 


494 


Retrotrans gag 


Retrotransposon gag protein 


2 


-5.1 


1 


180-273 


495 


CDP-
OH_P_transf


CDP-alcohol
phosphatidyltransferase


5.8e-08 


39.9 


1 


94-242 


495 


Cons hypoth69 
8 


Conserved hypothetical protein 
698 


3 


-173.7 


1 


136-379 
497


oxidored_q1_C


NADH-Ubiquinone 
oxidoreductase 


7.2 


tic t\ 
-00.0 


1 


27-276 


499 


RapGAP 


Rap/ran-GAP 


1.7e-21 


84.9 


1 


1335- 
1514 


500 


fn3


Fibronectin type III domain 


1.1e-12


55.6 




47-130 


501 


hormone_rec


Ligand-binding domain of 
nuclear hormone 


2e-45 


164.4 


1


364-545 


501 


zf-C4 


Zinc finger, C4 type (two 
domains) 


1.4e-16 


68.5 


1 


269-316 


502 


7tm 5 


7TM chemoreceptor


4.3 


-164.6 


\ 


9-304 


503 


RhoGEF 


RhoGEF domain 


2.7e-33 


124.0 




320-502 


504 


fn3


Fibronectin type III domain


1.5e-09 


45.1 




174- 

267:473- 
560 


505 


7tm_1


7 transmembrane receptor 
(rhodopsin family) 


1.7e-41 


151.3 


* 


83-332 


505 


7tm 5 


7TM chemoreceptor 


4.5 


-165.1 




89-327 


505 


DUF40 


Domain of unknown function 
DUF40 


4.8 


-130.6 


> 


79-274 


506 


PFEMP 


Plasmodium falciparum 
erythrocyte membrane p 


0.16 


-65.7 




919-1028 


507 


trypsin 


Trypsin 


2.6e-79 


276.9 


1 


218-559 


507 


SRCR 


Scavenger receptor cysteine-rich 
domain 


6.2 


-22.5 


1 


120-207 


508 


PKD 


PKD domain 


2.6e-09 


44.4 


1 


641-732 


508 


BNR 


BNR/ Asp-box repeat 


le-06 


35.7 


5 


54- 

65:102- 

113:338- 

349:415- 

426:457- 

468 


509


C1q


C1q domain


7.3e-32 


119.3 




211-335 


509 


Collagen 


Collagen triple helix repeat (20 
copies) 


3.8e-06 


33.8 




144-203 




509


Lysis_col


Lysis protein 


9.3 


-10.9 




95-130 


513 


7tm 1 


7 transmembrane receptor 


1.7e-10 


48.3 




43-294 


513 


Competence 


Competence protein 


6.8 


-104.0 




197-459 


513 


Na_H_antiporte
r


Na+/H+ antiporter family 


8.9 


-119.1 


■ 


126-404 


514 


7tm 5 


7TM chemoreceptor 


1 


-153.5 




164-454 


514 


sugar tr 


Sugar (and other) transporter 


2.8 


-182.4 




50-547 


515 


Peptidase C20 


Type IV leader peptidase family 


3.3 


-182.3 




99-278 


515 


MadM 


Malonate/sodium symporter 
MadM subunit 


4.7 


-20.6 




209-271 


516 


LRR 


Leucine Rich Repeat 


4.8e-31 


116.6 


8 


114- 

137:138- 
161:162- 
184:185- 
208:209- 
230:231- 
254:255- 
278:279- 
302 


516 


LRRNT 


Leucine rich repeat N-terminal 
domain 


0.00038 


27.2 


1 


24-55 
516


7tm 1 


7 transmembrane receptor 


0.0032 


-43.2 




434-683 


516 


EII-Sor


PTS system sorbose-specific iic 
compon 


5.8 


-140.2 


1


427-629 


516 


Cytidylyltrans 


Phosphatidate 
cytidylyltransferase 


7.1 


-89.9 


1 


515-612 


516 


oxidored_q1


NADH- 

Ubiquinone/plastoquinone 


9.7 


-171.5 


1 


470-680 


516 


MerC


MerC mercury resistance 
protein 


9.8 


-87.5 


1 


529-627 


519 


7tm 2 


7 transmembrane receptor 


2.3e-21 


84.4 


1


504-764 


519 


GPS 


Latrophilin/CL-l-like GPS 
domain 


2.7e-13 


57.6 


1 


448-501 


519 


HRM 


Hormone receptor domain 


0.0085 


15.8 




165-218 


519 


Me-amine-
deh_L


Methylamine dehydrogenase, L 
chain 


4 


-30.1 


1


57-188 


521 


SNF 


Sodium: neurotransmitter 
symporter family 


4.3e-20 


7.1 


1


61-289 


523 


SPRY 


SPRY domain 


6.1e-20 


79.7 


1 


153-284 


524 


7tm_1


7 transmembrane receptor 
(rhodopsin family) 


1.6e-52 


187.9 


1 


75-338 


524 


V1R 


Vomeronasal organ pheromone 
receptor family 


7.7 


-169.0 


1 


82-351 


525 


DUF284 


Eukaryotic protein of unknown 
function, DUF2 


2.1e-113


390.1 


, 


53-350 


526 


7tm_1


7 transmembrane receptor 
(rhodopsin family) 


0.037 


-67.9 


1 


71-379 


527 


Patched 


Patched family 


0.00021 


-419.9 


1 


1-484 


528 


PSS 


Phosphatidyl serine synthase 


7.3 


-242.7 


1 


115-277 


529 


Acyltransferase 


Acyltransferase 


0.27 


-15.8 




352-517 


531 


7tm_1


7 transmembrane receptor 
(rhodopsin family) 


0.0063 


-49.9 


J 


96-253 


532 


7tm_1


7 transmembrane receptor 
(rhodopsin family) 


8.6e-35 


129.0 


1 


62-311 


534 


7tm 2 


7 transmembrane receptor 


3.3e-73 


256.6 


1 


179-522 


534


GPS 


Latrophilin/CL-1-like GPS
domain 


2.8e-15 


64.2 




128-177 


534 


7tm 5 


7TM chemoreceptor 


1.7 


-157.4 


1 


175-433


534 


CbiM 


CbiM 


2.1 


-83.3 




280-437 


534 


cytochrome b 
C 


Cytochrome b(C- 
terminal)/b6/petD 


4 


-28.5 


■ 


152-254 


535 


Rhomboid 


Rhomboid family 


8.5e-18 


72.6 




647-789 


535 


Competence 


Competence protein 


4.4 


-100.3 




640-849 


536 


Rhomboid 


Rhomboid family 


8.5e-18 


72.6 


4 — 


670-812 


536 


Competence 


Competence protein 


4.4 


-100.3 




663-872 


538 


7tm_1


7 transmembrane receptor 
(rhodopsin family) 


6.5e-34 


126.1 




41-290 


542 


SEA 


SEA domain 


5.1e-10 


46.7 




472-591 


542 


EGF 


EGF-like domain 


0.57 


16.7 




425- 

462:633- 

672 


542 


EB 


EB module 


4.8 


-9.1 




412-462 


542 


Bowman- 
Birk leg 


Bowman-Birk serine protease 
inhibitor 


7.2 


-18.4 




628-672 


542 


Keratin B2 


Keratin, high sulfur B2 protein 


8.8 


-83.0 




254-385 


543 


SPRY 


SPRY domain 


7.8e-17 


69.4 




347-468
543


zf-C3HC4 


Zinc finger, C3HC4 type (RING 
finger) 


3.1e-11


50.7 


1 


16-56 


543 


zf-B_box


B-box zinc finger 


5.7e-05 


29.9 


1 


92-133 


544 


Ribosomal_S26 
e 


Ribosomal protein S26e 


2.1e-20 


81.2 


1 


1-110 


544 


rnaseA 


Pancreatic ribonuclease 


1.3e-07 


32.0 


1 


106-232 


545 


Patched 


Patched family 


0.33 


-525.2 


1 


37-846 


545 


oxidored_q3 


NADH- 

ubiquinone/plastoquinone 
oxidoreduct 


4.3 


-79.9 


1 


201-368 


545 


oxidored_q1


NADH-

Ubiquinone/plastoquinone
(complex I)


9.7 


-171.5 


1 


663-851 


545 


Keratin B2 


Keratin, high sulfur B2 protein 


10 


-83.9 


1 


11-141 


546 


7tm_1


7 transmembrane receptor 
(rhodopsin family) 


0.028 


-65.2 


1 


47-249 


547 


fn3


Fibronectin type III domain 


4.1e-102 


352.6 


6 


947-1034:
1046-1138:
1150-1239:
1251-1337:
1444-1527:
1541-1623














547


ig


Immunoglobulin domain


1.8e-87 


304.0 


9 


199- 

260:300- 

356:389- 

448:482- 

547:579- 

637:670- 

731:764- 

829:863- 

929:1364- 

1425 


548 


gla 


Vitamin K-dependent 
carboxylation/gamma-carb


3.7e-15 


63.8 




24-65 


550 


7tm_1


7 transmembrane receptor 
(rhodopsin family) 


1.1e-39


145.3 




41-290 


550 


DUF40 


Domain of unknown function 
DUF40 


2 


-123.7 


1 


39-229 


551 


HCO3_cotrans
p


HCO3- transporter family


0 


1723.0 




146-959 


551


xan_ur_permea
se


Permease family 


3.3 


-190.7 




477-941


551 


Plant_vir_prot


Plant virus coat protein 


9.3 


-51.7 




772-865 


551 


DENN 


DENN (AEX-3) domain 


9.5 


-71.3 




593-719 


552 


DUF6 


Integral membrane protein 
DUF6 


0.092 


9.6 




68-174 


552 


DUF250 


Domain of unknown function, 
DUF250 


2.8 


-98.0 




180-351 


552 


oxidored_q3 


NADH-ubiquinone/plastoquinone
oxidoreduct


5.9


-82.1




81-236


552 


7tm 5 


7TM chemoreceptor 


9.2


-170.6 


1 


54-338 
ACTIVATED PROTEIN C;
CHAIN: C, L; D-PHE-PRO- 
MAI; CHAIN: P; 


COAGULATION FACTOR 
EGF-LIKE MODULE OF 
BLOOD COAGULATION 
FACTOR X (N-TERMINAL, 
1APO 3 APO FORM) (NMR, 13 
STRUCTURES) 1 APO 4 




BLOOD COAGULATION 
FACTOR XA; CHAIN: L, C; 


FIBRILLIN; CHAIN: NULL; 


I; DES-GLA FACTOR VIIA 
(LIGHT CHAIN); CHAIN: L, 
M; (DPN)-PHE-ARG; CHAIN: 
C, D; PEPTIDE E-76; CHAIN: 
X, Y; 
COMPLEX (BLOOD
COAGULATION/INHIBITOR) 1 
AUTOPROTHROMBIN IIA; 
HYDROLASE, SERINE 
PROTEINASE), PLASMA CALCIUM 
BINDING, 2 GLYCOPROTEIN, 






BLOOD COAGULATION FACTOR 
STUART FACTOR; BLOOD 
COAGULATION FACTOR, SERINE 
PROTEINASE, EPIDERMAL 2 
GROWTH FACTOR LIKE DOMAIN 


MATRIX PROTEIN 
EXTRACELLULAR MATRIX, 
CALCIUM-BINDING, 
GLYCOPROTEIN, 2 REPEAT, 
SIGNAL, MULTIGENE FAMILY, 
DISEASE MUTATION, 3 EGF-LIKE 
DOMAIN, HUMAN FIBRILLIN- 1 
FRAGMENT, MATRIX PROTEIN 


COMPLEX 
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XY; 


DES-GLA FACTOR VIIA 
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X,Y; 


BLOOD COAGULATION 
FACTOR VIIA; CHAIN: L, H; 
SOLUBLE TISSUE FACTOR; 
CHAIN: T, U; D-PHE-PHE- 
ARG- 

CHLOROMETHYLKETONE 
(DFFRCMK) WITH CHAIN: C; 




Compound 


MATRIX PROTEIN 
EXTRACELLULAR MATRIX, 
CALCIUM-BINDING, 
GLYCOPROTEIN, 2 REPEAT, 
SIGNAL, MULTIGENE FAMILY, 
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FIBROBLAST GROWTH 
FACTOR 2; CHAIN: A, B, C, 
D; FIBROBLAST GROWTH 
FACTOR RECEPTOR 2; 
CHAIN: E, F, G, H; 


FIBROBLAST GROWTH 
FACTOR 2; CHAIN: A, B, C, 
D; FIBROBLAST GROWTH 
FACTOR RECEPTOR 2; 
CHAIN: E, F, G, H; 


NEURAL CELL ADHESION
MOLECULE; CHAIN: A, B, C,
15-34; 1045 147-161; 1944
43-65; 1330 104-119; 1947
18-42; 2872 143-158; 1292
21-48; 787 73-92; 1024 95-114; 1804 167-182; 1499
210-225; 997 256-275; 1133 314-345; 939 389-405; 1337
16-32; 1965 40-59; 506 66-86; 2091 111-126; 1647
155-172; 669 199-217; 1521 240-255; 1130 302-

408; 1074 423-444; 1905 450-468; 1163 552-572; 540 


323 


10 


11-38; 1993 50-65; 859 106-128; 1632 117-140; 870 
164-184; 1886 194-209; 1335 299-324; 1463 339- 

33Z, V3U4 13-431; 533 400-4oi; 130O 


324 


1 


35-55; 694 


325


1


22-43; 2636 


326 


1 


152-168; 610 


327 


4 


22-38; 3134 65-80; 1300 512-531; 2076 542-555; 746 


328 


3 


22-38; 3134 65-80; 1300 493-507; 936 


329 


3 


22-38; 3134 65-80; 1313 512-531; 2076


330 


4 


27-48; 1144 69-92; 2697 119-134; 1835 160-182; 552


331 


3 


31-47; 1577 652-667; 592 930-952; 3003


332 


1 


148-169; 2982 


333 


7 


83-99; 1049 110-125; 1190 182-198; 1150 206-222; 1406
232-246; 953 278-295; 1834 338-353; 1407 


334 


5 


9-35; 1516 26-49; 2339 69-87; 1588 141-155; 2014 
154-180; 579 


335 


3 


58-73; 589 285-300; 1231 493-509; 2248 


336 


8 


285-303; 1598 417-430; 866 549-566; 1758 569-583; 995
634-650; 1821 659-674; 1429 691-709; 2005 724-737; 825


337 


1 


66-92; 508 


338 


7 


24-39; 2590 60-73; 600 91-119; 1337 148-163; 566 
196-214; 2187 236-259; 878 272-291; 1508 


339 


7 


24-39; 2590 60-73; 600 91-119; 1337 148-163; 566
196-214; 2187 236-259; 878 272-291; 1508 


340 


5 


18-33; 955 222-237; 670 282-299; 1484 310-325; 786 
710-731; 2486 


341 




447-464; 826 548-563; 848 646-666; 2709 680-702; 1087
712-727; 1843 752-770; 1193 799-818; 2230 844-860; 1402
877-893; 1767


342 


5 


25-51; 2632 61-75; 1133 92-120; 1945 141-158; 1186 
177-196; 1468 


343 


5 


41-59; 1627 54-85; 2078 141-162; 1510 178-199; 2300 
241-266; 1378 


344 


7 


28-52; 2109 64-85; 1007 95-123; 1859 147-161; 875
200-219; 1807 247-263; 1555 276-295; 1639


345 


11 


91-109; 760 245-262; 900 405-424; 2528 436-454; 1166
460-478; 1710 514-530; 1043 551-573; 2733 597-615; 1300
625-644; 1509 688-707; 1446 773-790; 617


346




149-166; 900 309-328; 2528 340-358; 1166 364-382; 1710
418-434; 1043 455-477; 2733 501-519; 1300 529-548; 1509
592-611; 1446 677-694; 617




347




38-54; 1710 64-80; 1230 150-169; 1096 177-189; 660
205-220; 1089 247-259; 583 294-311; 1199




348


1


25-44; 1754


349


4


61-78; 1267 92-107; 1758 96-132; 910 125-145; 1211


350


1


63-81; 2993


351


1


21-37; 3067


352


1


33-49; 829


353


1


14-32; 1792


354


1


53-72; 1987


355


1


501-522; 2686


356


2


235-254; 582 307-322; 1905






357 


3 


305-324; 989 


359-385; 512 


704-723; 3256 




358 


1 


20-39; 1897 


359 


1 


20-39; 1897 


360 


1 


21-36; 3076 


361 


2 


13-32; 2338 


110-126; 621 






362 


1 


342-363; 3126 


363 


4 


25-43; 2055 


148-164; 770 


232-258; 718 


270-283; 1272 


364 


6 


43-59; 1008 


80-95; 798 


130-149; 886 


157-175; 1133 






191-212; 1337 226-250; 1425 




365 


10 


58-74; 1806 81-103; 1546 115-127; 710 174-189; 1420
278-299; 1477 321-337; 1182 347-363; 1923 383-398; 1258
403-426; 1703 439-454; 1202




366


3


22-52; 1371 


65-89; 1862 


100-121; 994 




367 


• 
I 


217-236; 652 


368


2 


21-36; 2696 


95-110; 1111 






369


5


576-592; 578 


747-762; 2335 


764-786; 1265 


804-825; 1715 






856-871; 1373 




370


1


120-140; 3089




3 


100-115; 939 


284-302; 707 


332-347; 933 




372


7


47-64; 1640 


87-101; 700 


119-134; 1949 


143-159; 507 






184-199; 593 208-223; 744 456-477; 2177 


373


7 


163-175; 1638 


182-207; 1865 






374


1 


32-51; 3413 


375


3 


225-243; 1004 


324-339; 1291 


386-402; 1266 




376


2


196-214; 1004 


313-329; 1173 






377 


2 


126-143; 1381 


149-161; 668 






378 


3 


126-143; 1381 


149-161; 668 


195-220; 807 




379


1


80-103; 3414 


380 


7 


20-41; 602


52-71; 1552 


83-98; 1700 


103-120; 1370 






136-151; 2709 162-178; 1788 193-211; 1280


381 


3 


44-62; 2777 


65-80; 1045 


141-156; 1507 




382 


1 


92-112; 1518 


383 


2 


73-88; 605 


334-356; 1208 






384 


12 


54-69; 1830 90-109; 2293 118-133; 1498 156-176; 884
184-200; 1166 232-251; 1806 282-297; 1680 320-335; 2405
349-364; 1374 377-401; 1798 423-437; 1391 444-463; 2164
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385 


5 


49-66; 2934 135-149; 610 177-197; 653 275-289; 698 
397-417; 1229 


386 


5 


49-66; 2934 166-188; 504 190-208; 500 266-280; 698
388-408; 1229


387 


2 


35-61; 782 69-85; 2708 


388 


2 


13-32; 1026 364-383; 1294 


389 


5 


297-315; 565 321-336; 515 340-363; 626 934-954; 875 
1131-1147; 556 


390 


4 


27-43; 1142 103-122; 1568 138-154; 868 174-204; 1058


391 


3 


90-112; 638 127-145; 669 209-229; 733


392 


5 


195-216; 2012 224-246; 640 258-279; 2594 294-313; 1189
342-362; 2675 


393 


9 


68-88; 2263 115-130; 1131 142-162; 2103 172-187; 986 
212-229; 2963 236-251; 1166 274-291; 2044 311-326; 1229
337-357; 2709


394 


1 


126-141; 896 


395 


14 


134-159; 1969 296-312; 1030 394-418; 2134 427-440; 1532
432-458; 2248 452-469; 1111 500-518; 1407 536-549; 1051
616-633; 2001 817-832; 1658 841-858; 2487 866-889; 943
912-934; 1900 940-957; 1433


396 


2 


311-344; 667 373-390; 788


397 


1 


204-228; 2681 


398 


11 


61-80; 3083 91-107; 866 120-142; 886 154-169; 1501 
196-208; 865 267-286; 1159 315-331; 2009 357-375; 1205
377-404; 2067 416-433; 913 447-463; 2180


399 


2 


53-72; 2827 291-307; 809 


400 


2 


28-59; 982 54-69; 843 


401 


1 


188-207; 2756 


402 


2 


120-138; 631 196-211; 534 


403 


2 


64-86; 2717 120-136; 1251 


404 


6 


21-42; 555 76-100; 1949 130-150; 1051 204-219; 943 
232-248; 1740 260-278; 1996 


405 


8 


84-101; 750 135-154; 1635 162-178; 1545 187-204; 1038 
211-227; 2064 232-245; 1277 265-286; 1440 298-313; 1011


406 


10 


167-182; 1236 192-213; 2175 202-237; 869 270-284; 1296 
296-316; 1177 309-327; 1613 400-412; 1434 597-614; 1965
624-660; 681 722-744; 2309


407 


1 


45-67; 3251


408 


3 


53-83; 1832 107-121; 1361 128-151; 1826 


409 


1 


165-186; 1496 


410 


2 


328-350; 819 433-448; 634


411 


7 


26-48; 2329 61-83; 815 95-120; 2154 143-159; 947 
205-222; 1700 237-260; 1060 270-292; 1172 




0 


/J-o/, llo4 Ivlo 145-loO;20Uo lyo-213;2o24 
235-256; 1873 281-300; 1350 


413 


2 


226-245; 2251 263-287; 800 


414 


4 


48-64; 1636 92-110; 1288 139-157; 930 171-192; 2385


415 


10 


64-84; 854 188-201; 2590 218-237; 1364 386-401; 2666 
405-425; 1179 874-895; 1854 944-961; 1011 1000-1022; 1158
1040-1065; 894 1072-1088; 1850


416 


4 


105-120; 2238 127-148; 1679 167-183; 2605 202-217; 1098 


417 


2 


49-64; 631 159-173; 822


418 


13 


241-255; 643 382-400; 1292 413-428; 1275 433-448; 852 



WO 03/025148



PCT/US02/29964



Table 8



SEQ ID NO:


Number of Transmembrane Domains


For Each Transmembrane Domain, its Transmembrane Domain Position in SEQ ID NO: and its TM Pred Score














463-485; 1608 491-509; 732 589-605; 1660 630-645; 1543
679-691; 1481 720-735; 2038 775-794; 1386 801-817; 1752
849-864; 1553




419


3


154-172; 1020 


185-200; 629 


231-251; 1947 






420


5


34-50; 668 


70-85; 566 


264-282; 1020 


295-310; 629 






341-361; 1947 








2 


18-34; 530 


52-73; 703 






422 


3 


208-226; 725 


542-558; 567 


570-599; 943 




423 


8 


56-71; 578 


211-228; 1481 


328-346; 644 


454-473; 731 






587-601; 587 699-714; 553 1039-1055; 612 1489-1518; 771








424 


1 


411-432; 2031 


425 


1 


51-68; 2943 


426 


1 


106-120; 2492 


427 


9 


42-57; 1250 


81-93; 1131 


95-111; 1306 


103-139; 901 






131-148; 1307 160-178; 1366 199-220; 1093 256-276; 1647


311-326; 1736 








428


42-57; 1250 


81-93; 1131 


95-111; 1306 


103-139; 901 






131-148; 1307 160-178; 1366 199-220; 1093 256-276; 1647


314-332; 902 


368-384; 990 




429 


1 
i 


85-101; 1852 


430


3


198-216; 617 389-404; 1219 429-445; 1499


431


1


42-60; 2634


432


1


215-230; 2143


433


3


29-52; 2263 62-82; 1557 94-113; 2561


434


4


96-112; 1641 167-187; 2265 202-224; 1612 257-272; 2465


435


1


94-114; 2794


436


2


73-92; 2179 123-137; 779


437


1


271-292; 2993


438


1


727-744; 2924


439


1


78-102; 2634


440 


4 


90-110; 536 


114-131; 907 


183-195; 654 


268-291; 977 


441


4


90-110; 536 


114-131; 907 


183-195; 654 


268-291; 977 


442


4


90-110; 536 114-131; 907 183-195; 654 268-291; 977


443


5


53-69; 2297 


83-98; 1058 


145-163; 1504 


179-194; 1353 






206-222; 2021 






444 


3 


78-98; 2028 


134-150; 1060 


224-243; 1701 




445 


4 


17-42; 706 


53-70; 1592 


97-112; 1041 


142-160; 2123 


446 


4 


198-214; 755 


274-289; 868 


306-321; 1260 


330-345; 737 


447 


1 


46-64; 1815 


448 


1 


129-154; 569 


449 


1


468-489; 2129


450 


1


354-373; 3038 


451 


2 


64-79; 726 


73-97; 888 






452 


3 


151-166; 645 


186-208; 1300 


255-270; 508 




453 


3 


82-95; 530 


112-129; 1374 


1470-1491; 3847 




454 


2 


30-43; 2002 


302-320; 1525 






455 


2 


84-96; 576 


892-911; 2528 






456 


1 


28-48; 1700


457 


1 


77-103; 2678 


458 


5 


25-50; 2582 


61-82; 1050 


92-120; 827 


140-155; 831 






199-214; 1366 




459 


7 


33-50; 2479 


58-73; 1393 


94-115; 882 


144-162; 671 
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2 14-231; 2323 25o-3Uy, IjV3 3/9-398; 2767 


460 


2 


39-58; 1574 90-107; 2845 


461 


2 


166-183; 1505 206-228; 2412 


462 


2 


103-118; 554 158-176; 1691


463 


4 


155-170; 1480 316-331; 707 340-357; 1159 368-381; 609 


464 


2 


63-79; 1054 638-658; 2381


465


1 


94-109; 1151 


466 


3 


340-355; 673 386-400; 599 435-451; 1027


467 


2 


40-55; 884 74-88; 904 


468 


3 


63-87; 668 134-150; 782 165-182; 1034 


469 


10 


49-66; 1360 79-94; 1389 111-124; 917 138-153; 1267 
165-179; 890 182-202; 532 229-243; 898 254-271; 1978
270-288; 1076 309-325; 1735


470 


3 


107-122; 720 141-162; 1315 193-208; 759 


471 


2 


146-161; 510 194-221; 1018 


472 


3 


16-32; 1307 69-83; 1789 88-114; 1279 


473 


4 


16-32; 1307 69-83; 1789 88-114; 1279 129-154; 1198


474 


4 


38-54; 1155 103-121;2670 134-148; 1558 195-215; 1883 


475 


5 


90-112; 638 127-145; 669 209-229; 749 313-331; 644
406-422; 904 


476 


2 


337-361; 1379 527-543; 559 


477 


6 


28-43; 1439 94-123; 768 143-157; 1354 200-222; 2716 
240-263; 1191 273-295; 1338 


478 


4 


71-88;2706 116-137; 867 136-153; 1128 171-195; 863 


479 


4 


47-59; 1552 63-86; 2366 107-124; 1545 143-170; 2265 


480 


4 


27-60; 710 83-101;931 116-152;668 603-627; 1141 


481 


13 


265-279; 643 417-435; 1292 448-463; 1319 468-483; 852 
498-520; 1608 526-544; 732 627-643; 1660 668-683; 1543
717-729; 1481 758-773; 2038 813-832; 1386 839-855; 1752
887-902; 1553


482 


5 


37-50; 569 445-463; 2049 489-513; 1074 529-549; 2945 
552-570; 1394 


483 


5 


37-53; 1814 71-86; 1511 93-108; 1516 121-136; 1562 
160-175; 2012 


484 


1 


103-118; 1952


485 


6 


121-139; 864 584-605; 2969 619-635; 1436 649-667; 1359 
699-719; 1257 746-762; 1819 


486 


7 


17-40; 2341 55-70; 1212 90-111; 1353 132-152; 1570 
185-203; 1862 221-237; 1592 258-281; 755 


487 


1 


73-92; 1951 


488 


2 


65-80; 2366 89-102; 1530


490 


3 


62-76; 1511 91-109; 609 160-185;629 


491 


7 


25-40; 1285 58-76; 922 91-107; 584 142-164; 1715
200-218; 1486 244-259; 2257 272-284; 1020


492 


2 


159-174; 702 216-234; 2518 


493 


3 


20-35; 506 49-69; 984 333-352; 1717 


494 


1 


363-379; 1359 


495 


9 


52-71; 2689 88-103; 1366 153-165; 2603 188-205; 1124 
221-240; 2123 267-279; 1245 290-309; 1070 323- 
337; 1257 345-359; 844 


496 


2 


151-166; 1709 214-235; 1665


497 


6 


102-119; 577 136-153; 1288 149-173; 551 194-212; 697 
262-281; 1364 304-316; 1698 
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498 


2 


136-151; 751 193-212; 2670 


499 


7 


181-196; 658 272-287; 862 740-753; 1177 827-845; 521
900-920; 771 926-941; 1124 1467-1492; 835


500


2 


26-42; 553 172-188; 2514 


501 


1 


451-466; 826 


502 


6 


24-45; 1693 72-84; 881 95-114; 996 141-153; 878
200-220; 2700 251-265; 1354 


503


6


726-747; 724 776-791; 985 806-828; 806 1019-1039; 680
1058-1082; 605 1111-1131; 929


504 


2 


73-89; 1003 572-595; 2977


505 


7


oo-91;2217 103-117; 1024 145-162; 1476 184-200; 1937 
239-258; 2428 287-302; 1125 312-334; 1293 


506 


4


59-74; 784 411-426; 543 555-570; 1432 755-770; 543


507 


5 


48-71; 2145 138-154; 508 233-257; 580 278-290; 793 
341-362; 1028 


508 


4


22-41; 661 753-771; 682 866-881; 639 948-965; 1707 


509 


2 


93-109; 2922 246-262; 610 


510 


3 


45-71; 1224 97-119; 2200 105-128; 1270


511 


1


96-118; 2253


512 


1 


213-228; 2903 


513 


12 


27-53; 2787 63-76; 997 108-129; 707 155-170; 1049 
201-221; 1704 247-263; 1270 274-296; 1442 385-397; 1137
437-452; 1414 510-529; 799 549-563; 1638 576-596; 953


514 


8 


200-215; 1460 271-289; 2381 361-378; 1369 396-416; 2113
440-455; 1279 477-495; 1320 521-541; 1573 573-593; 2337


515 


6 


94-111; 2450 116-137; 985 152-171; 2459 188-203; 1343
223-243; 4 668 254-269; 1 184 


516 


7 


422-439; 2505 460-482; 954 494-527; 1524 546-562; 1289
588-606; 2147 631-648; 1264 667-686; 1796 


517 


2 


23-36; 582 40-73; 1069 


518 


11 


20-35; 1776 53-68; 1782 86-102; 1155 131-146; 1074 
164-179; 2382 442-459; 1328 495-510; 1765 527- 
542; 1214 547-562; 1720 590-617; 795 625-644; 1995 


519 


9 


314-331; 826 415-430; 848 513-533; 2709 547-569; 1087 
579-594; 1843 619-637; 1193 666-685; 2230 711-727; 1402
744-760; 1767


520 


2 


62-77; 645 116-133; 1910 


521 


5 


70-85; 975 101-119; 2374 140-158; 1457 228-244; 2107 
256-274; 1074 


522 


1 


81-97; 2470 121-136; 1224 149-176; 1604 209-225; 1439
267-286; 2119 309-324; 1473 376-393; 1898


523 


2 


34-48; 680 160-175; 848




7 


Jil- 0OO*7 0< 1 K* inn i>ii i ca* mm n< mo. ncc 
5y-5J»zyy/ 5O-lIo;i0J2 141-156; 1091 175-192; 1755 

228-249; 1807 281-297; 1698 318-341; 1040 


525 


3


34-52; 2348 155-170; 575 323-337; 2673 


526 


5 


65-83; 3178 93-107; 1020 137-158; 2389 172-192; 1494 
224-241; 3165 


527 


7 


38-55; 2045 125-140; 1136 320-339; 2947 335-360; 1228
364-386; 1097 422-437; 943 451-469; 1867 


528 


11 


118-133; 2943 199-212; 1121 230-251; 2184 264-285; 1606
302-317; 1270 343-360; 1239 422-446; 1581 457-472; 1460
492-511; 2540 503-532; 504 562-577; 1749


529 


4 


81-108; 674 150-166; 1423 300-315; 1978 486-501; 799 


530 


6 


27-43; 974 66-85; 1887 98-114; 1177 120-142; 1864
163-180; 871 208-225; 2625 


531 


4 


88-104; 2727 112-137; 1466 152-173; 1863 195-216; 1523 


532 


8 


55-71; 2368 82-96; 847 117-141; 1703 161-180; 1265
218-237; 2278 265-281; 1248 297-313; 748 325-346; 1097


533 


3 


471-484; 505 578-593; 1235 605-619; 981 


534 


10 


50-67; 900 188-207; 2528 219-237; 1166 243-261; 1710 
297-313; 1043 334-356; 2733 380-398; 1300 408- 
427; 1509 471-490; 1446 556-573; 617 


535 


7 


410-425; 2180 656-671; 1017 692-711; 1695 717-735; 898
751-767; 2256 773-789; 1341 809-824; 2908 


536 


7 


433-448; 2180 679-694; 1017 715-734; 1695 740-758; 898
774-790; 2256 796-812; 1341 832-847; 2908


537 


1 


66-88; 2934 


538 


7 


26-51; 1782 61-83; 603 91-120; 1188 140-154; 1223 
198-226; 2284 245-260; 1580 273-292; 1207 


539
27-39; 1172 50-65; 1681 80-104; 1084 109-138; 1616
151-163; 1311 165-188; 1247 200-215; 971 


540 
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100-116; 1881 135-156; 1002 
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126-145; 939 142-165; 508 680-701; 2775 
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26-44; 863 


544 
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83-99; 2738 
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11 


25-40; 737 250-267; 2877 277-299; 1267 325-342; 1801 
357-370; 1156 440-459; 2243 702-720; 1515 729-746; 2454
755-770; 589 799-821; 2411 836-850; 1194


546 


6 


30-46; 1302 49-69; 1510 76-90; 1070 104-123; 1711 
147-160; 1419 186-202; 2239 


547 
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55-70; 1001 95-1 17; 1013 386-406; 973 664-682; 599 
1655-1668; 1126 


548 


1 


82-101; 3223 
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3 


55-73; 2750 79-96; 1280 115-129; 1733


550 
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25-48; 2164 61-75; 774 91-120; 1887 140-158; 937 
199-219; 2862 245-260; 1258 273-292; 1715 330- 

345; 782 


551 


13 


334-354; 586 480-495; 1208 509-529; 1145 565-581; 1273
593-611; 1007 695-710; 1443 730-748; 1753 784-800; 1657
826-846; 2236 882-900; 1281 885-913; 1566 902-926; 923
972-989; 1888


552 


9 


54-76; 2605 103-118; 984 130-150; 2154 160-175; 1065 
199-216; 3177 225-239; 1416 262-282; 1291 299-314; 1383
325-342; 2377


Table 9



SEQ ID NO: of full-length nucleotide sequence


SEQ ID NO: of full-length peptide sequence


SEQ ID NO: of contig nucleotide sequence


SEQ ID NO: of contig peptide sequence


Identification of Priority Application that contig nucleotide sequence was filed (Attorney Docket No._SEQ ID NO.)*
1 AA 

IOO 


1*7 < 
3/0 


All 

033 


QCl 

553 


1A1 1 £ 

/VI 10 


1 A| 

101 


3// 


034 


554 


*7AA 1/CCCQ 

/yo 2o55y 


1 AT 
102 


no 
3/0 


035 


QCC 

555 


*7aa i/:cco 

/yo 2055y 


1 A1 

103 


11 A 

379 


030 


850 


non AC/K 
/8/ y540 


1 A /I 

104 


10A 
380 


03/ 


55/ 


7QA tZf\A*7 

/54 004/ 


1 AC 


101 
381 


JCIO 

038 


oco 
858 


"7Q/I 1C1A 

/54 2520 


1 AiC 

106 


101 
352 


039 


OCA 

85y 


HQ A 1/IA1 

/54 3402 


107 


101 
383 


040 


800 


ioa c 1 ill 
/54 5142 


1 AO 

108 


384 


041 


OiCI 

801 


*1QA /UJ1A 

784 4030 


1 AA 

109 


IOC 

385 


£/ii 
042 


0£ 1 

802 


"70*7 1 A1 1 

787 1021 


t 1 A 

110 


1 Odl 

386 


643 


863 


787 1021 ; 


111 


387 


£.AA 

644 


0£ A 

864 


IOA ACA1 ' 

784 4543 


112 


100 
388 


^4 c 

045 


805 


787 4013 \ 


113 


389 


£.A£. 

646 


0£.c 

866 


"TO it \ 1 4VT 

784 1107 


114 


^ AA 

390 


647 


867 


790 14636 


115 


391 


/■in 

648 


868 


787 3544 


116 


392 


649 


869 


"TO A IIO 1 

784 2281 


117 


393 


650 


870 


784 4265 


118 


394 








1 1 A 

119 


*> AC 

395 


/"CI 

651 


871 


TO A IOOC 

784 1885 


120 


396 


i?C*^ 

652 


872 


790 2819 


Alt 

121 


"i AT 

397 


653 


873 


784 7981 


122 


*> AO 

398 


654 


874 


70C 1A11 

785 2923 


i n 
123 


ion 

399 


iCCC 

055 


01c 
875 


IOA ACOt\ 

784 4589 


124 


/I A A 

400 








11C 

125 


401 


050 


870 


1AA l^ylAT 

/y0 2040/ 


I/O 


AM 

402 


rftC"7 

05/ 


on 
8// 


*7AA OA11 

/yo 8012 


in 
12/ 


403 


055 


070 
5/5 


/yi 131 


125 


A(\A 

4U4 


^ca 
05y 


5/y 


IOA 1 tVX 1 0 

/yo 103 iy 


11Q 


405 


££A 

000 


HQ A 

550 


1 A A 1 QAAQ 

/yo 1 504 y 


1 1A 

130 


400 


££1 
001 


OOI 
551 


•70 a AQf^^ 

/5y 4yoi 


ill 


ACM 
4VJ/ 








ill 


405 


002 


001 
552 


nQA AQtl 
/54 4513 


133 


A no 
4UV 








1 1A 

134 


/II A 

410 


003 


001 
553 


7QA 10"7*7 

/54 3y// 


1 1< 
135 


/ii 1 
41 1 


004 


554 


10 >i i cm 
/54 350/ 


tin 
130 


/in 
412 


005 


00c 
585 


TO/l OIAI 

784 8101 


13/ 


/in 
413 


000 


880 


70J 1K1 

784 1203 


110 

138 


414 


007 


OOI 

887 


*7A1 IAOI 

791 3081 


110 
135/ 


/I1C 

415 


005 


000 
858 


*7A1 CI AT 

792 5307 




41a 


Uv/ 


007 


7514 117 


141 


417 


670 


890 


790 311 


142 


418 


671 


891 


784 3298 


143 


419 


672 


892 


788 2631 


144 


420 


673 


893 


788 2631 1 


145 


421 








146 


422 


674 


894 


787 2204 


147 


423 


675 


895 


787 4220 


148 


424 


676 


896 


784 1948 


149 


425 


677 


897 


791 2929 



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y4/ 


7rtA i iCnoiC 
790 loyoo 


1 1 ft 
ZlO 


404 








910 
z i y 


40 s 


77ft 
/Z5 


y4o 


7QC 77« 


990 










991 

ZZ 1 


407 
*»y / 


790 
/zy 


040 

y*iy 


HQ A 77/1 Q 

/o4 zz4o 


222 


40R 


770 
/JU 


0«»ft 

yju 


7QA 7<7/l< 

/yu ZD34D 


992 


400 


721 


0S1 
yji 


7R4 SftA7 
/o4 DUOz 


994 


son 


729 
1 oc 


OS9 

yjz 


7fiO ft 17 
/oV ol / 


225 


SOI 

Jul 








226 


509 


722 


0S2 


7ft7 ftftlft 
lot OOlU 


227 


503 


724 


0S4 

yj*t 


7R7 1 S79 
tot 13/Z 


228 


504 


72S 


OSS 


70ft 1790#» 
/yu izzyo 


229 


505 


736 


OSfi 


70ft 97171 
/yu z / 1 / j 


230 


506 


727 


0S7 

7J / 


7R4 1S71 


231 


S07 


72R 

/JO 


OSR 


7R4 174£ 


232 


508 


720 


OSO 

yjy 


7ft4 1 ft07 
/ o*f I uy / 


212 


500 








234 ! 


510 








235 


S1 1 


740 


Q&O 
you 


7fl4 S07A 

/ 54 DyzO 


236 


S19 1 








237 


512 








238 


514 


741 


QA1 
yoi 


7ft4 SI 1ft 
/o4 3Jlo 


239 


SIS 


749 


0*59 
yoz 


70ft 177^ft 

/yu iz/Do 


240 


516 


743 


963 


784 5328 


241 


517 








242 


518 


744 


964 


785 507 


243 


519 


745 


965 


789 4217 


244 


520 


746 


966 


791 2641 


245 


521 


747 


967 


790 23507 


246 


522 


748 ~\ 


968 


784 2608 


247 


523 


749 ; 


969 


787 84 


248 


524 


750 


970 


790 16983 


249 


525 
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Table 9 



SEQ ID NO: 
of full-length 
nucleotide 
sequence 


SEQ ID NO: 
of full-length 
peptide 
sequence 


SEQ ID NO: 
of contig 
nucleotide 
sequence 


SEQ ID NO: 
of contig 
peptide 
sequence 


Identification of 
Priority Application 
that contig nucleotide 
sequence was filed 
(Attorney Docket
No._SEQ ID NO.)*


250


526








251


527 








252


528


751 


971 


787 4538 


253 


529 


752 


972 


784 4452 


254


530


753 


973 


784 3405




255


531


754


974


787 2752




cn 
532 








257


533








258


534


755 


975 


785 1541 


259


535


756 


976 


784 4406 


260


536


757 


977 


784 4406 


261 


537 


758


978


785 33 


262 


538 


759


979 


787 5204 


263 


539


760 


980 


784 482 


264 


540 


761 


981 


787 6564 


265 


541 


762 


982 


788 6847


266 


542 


763 


983 


785 1239 


zo/ 


D**3 


•7/^4 


(MM 

yon 


*71M >f A/Cft 

/o4 4Uoy 


268 


544 


765 


985 


785 1321 


269 


545 


766 


986 


785 658 


270 


546 


767 


987 


787 3324 


271 


547 


768 


988 


784 10120 


272 


548 


769 


989 


787 10039 


273 


549 


770 


990 


787 9881 


274 


550 








275 


551 


771 


991 


789 1858 


276 


552 


772 


992 


784 10115 



*784_XXX = SEQ ID NO: XXX of Attorney Docket No. 784, US Serial No. 09/488,725 
filed 01/21/2000, the entire disclosure of which, including sequence listing, is 
incorporated herein by reference. 

785_XXX = SEQ ID NO: XXX of Attorney Docket No. 785, US Serial No. 09/491,404 
filed 01/25/2000, the entire disclosure of which, including sequence listing, is 
incorporated herein by reference. 

787_XXX = SEQ ID NO: XXX of Attorney Docket No. 787, US Serial No. 09/496,914 
filed 02/03/2000, the entire disclosure of which, including sequence listing, is 
incorporated herein by reference. 

788_XXX = SEQ ID NO: XXX of Attorney Docket No. 788, US Serial No. 09/515,126 
filed 02/28/2000, the entire disclosure of which, including sequence listing, is 
incorporated herein by reference. 

789_XXX = SEQ ID NO: XXX of Attorney Docket No. 789, US Serial No. 09/519,705
filed 03/07/2000, the entire disclosure of which, including sequence listing, is
incorporated herein by reference.

790_XXX = SEQ ID NO: XXX of Attorney Docket No. 790, US Serial No. 09/540,217
filed 03/31/2000, the entire disclosure of which, including sequence listing, is
incorporated herein by reference.

791_XXX = SEQ ID NO: XXX of Attorney Docket No. 791, US Serial No. 09/552,929
filed 04/18/2000, the entire disclosure of which, including sequence listing, is 
incorporated herein by reference. 

792_XXX = SEQ ID NO: XXX of Attorney Docket No. 792, US Serial No. 09/577,408 
filed 05/18/2000, the entire disclosure of which, including sequence listing, is 
incorporated herein by reference. 
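
The footnote convention above (Attorney Docket No., an underscore, then the SEQ ID NO in that priority application) can be illustrated with a short sketch. This is only an illustration under stated assumptions: the function name `parse_priority_id` and the dictionary layout are hypothetical and not part of the application; the docket numbers, serial numbers and filing dates are copied from the footnotes. A space is accepted in place of the underscore because the table entries appear that way in this text.

```python
import re

# Docket number -> (US serial number, filing date), copied from the footnotes.
PRIORITY_APPLICATIONS = {
    "784": ("09/488,725", "01/21/2000"),
    "785": ("09/491,404", "01/25/2000"),
    "787": ("09/496,914", "02/03/2000"),
    "788": ("09/515,126", "02/28/2000"),
    "789": ("09/519,705", "03/07/2000"),
    "790": ("09/540,217", "03/31/2000"),
    "791": ("09/552,929", "04/18/2000"),
    "792": ("09/577,408", "05/18/2000"),
}

# Three-digit docket number, an underscore or a space, then the SEQ ID NO.
_ID_PATTERN = re.compile(r"^(\d{3})[_ ](\d+)$")


def parse_priority_id(identifier):
    """Split an identifier such as '784_3298' into docket and SEQ ID NO.

    Hypothetical helper: raises ValueError for malformed identifiers or
    docket numbers that are not listed in the footnotes.
    """
    match = _ID_PATTERN.match(identifier.strip())
    if match is None:
        raise ValueError(f"not a docket_SEQ-ID identifier: {identifier!r}")
    docket = match.group(1)
    seq_id = int(match.group(2))
    if docket not in PRIORITY_APPLICATIONS:
        raise ValueError(f"unknown attorney docket number: {docket}")
    serial, filed = PRIORITY_APPLICATIONS[docket]
    return {"docket": docket, "seq_id": seq_id, "serial": serial, "filed": filed}
```

For example, `parse_priority_id("784_3298")` resolves to SEQ ID NO: 3298 of the application filed under Attorney Docket No. 784.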
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Table 10 



SEQ ID NO of Full-length
Nucleotide Sequence


SEQ ID NO of Full-length
Peptide Sequence


SEQ ID NO in
Priority Application
USSN 60/323,739








i 
i 


777 
Z / / 


1 
1 


9 

z 


979 
Z/O 


9 
Z 


•i 
j 


770 

z/y 


3 


A 
H 


9flA 
ZoU 


4 


C 


7Q 1 
Zol 


5 


0 


7Q7 

zoz 


0 


/ 


701 
Z53 


7 


o 
o 


Zo4 


o 
8 


o 

y 


7QC 
Zo5 


9 


1 A 


ton 

ZOO 


10 


1 1 


287 


11 


1 7 
1Z 


too 
ZOO 


12 




ion 

289 


13 


1 A 

14 


290 


14 


15 


291 


15 


10 


292 


16 


1 *7 
1 / 


293 


17 


1 Q 

to 


294 


18 




295 


19 


ZU 


29o 


20 


9 1 
ZI 


29/ 


21 


zz 


7AQ 

298 


22 


71 
Zj 


7QO 

zyy 


23 


9/1 

z** 


inn 
3UU 


24 


9^ 


iai 
3UI 


7C 

25 


7A 

zo 


iai 
3UZ 


26 


77 
z / 


1A1 
3U3 


2/ 


Zo 


31/4 


28 


Z7 


1A^ 
3U5 


7fl 

29 


in 


1A< ] 

3U0 


30 




1A7 
3U/ 


i i 
31 


jZ 


lAft 
3U5 


32 


11 


1AO 

3uy 


11 

33 




11 A 
3IU 


i>i 
34 


1« 
JJ 


111 
311 


1C 

35 


1ft 


119 ' 
3IZ 


lit 

3o 


17 


111 i 
313 


37 


1ft 
Jo 


1 1 A 

3 14 j 


38 


70 

jy 


1 1 < 
313 


39 


Aft 


11 A. 
310 


40 


41 


117 
31/ 


A 1 

41 


HZ 


lift 


42 


43 


319 


41 


44 | 


320 


44 


45 


321 


45 


46 


322 


46 


47 


323 


47 | 


48 


324 


48 


49 


325 


49 


50 


326 


50 


51 


327 


51 


52 


328 


52 
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SEQ ID NO of Full-length 


SEQ ID NO of Full-length 


SEQ ID NO in 


Nucleotide Sequence 


Peptide Sequence 


Priority Application 






USSN 60/323,739 


33 


sly 


53 


34 


330 


54 


fC 

33 


331 


Cf 

55 


CA 


332 


56 


3/ 


333 


C*"I 

57 


CQ 
30 


334 


58 


en 

jy 


335 


Cft 

59 


ou 


336 


60 


ai 
01 


337 


61 


AO 
OA 


338 


62 


AI 

03 


3351 


63 


04 


340 


64 


03 


1A 1 

341 


65 


AA 
00 


342 


66 


0/ 


1A1 

343 


67 


AO 
Do 


344 


68 


65J 


345 


69 


/0 


1 AH 

346 


70 


"71 
/l 


347 


71 


*7") 


"> AO 

348 


72 


15 


349 


73 


74 


350 


74 


75 


351 


75 


76 


352 


76 


77 


353 


77 


/o 


354 


78 


/y 


ice 

355 


79 


oU 


356 


80 


Q 1 
Of 


357 1 


81 


OA 


ICO 

358 


82 


81 
OJ 


33y 


83 


CM 

o*+ | 


300 


OA 

84 


OO 


1A1 
301 


oc 

85 


ft A. 
50 


J Da 


o^ 

86 


5/ j 


1A1 

303 


87 ! 


oo 


J\r* 


88 | 


ftO 

o5r 


1AC 
303 


Oft 

89 


on 
yv 


7AA 
300 


90 


01 

y i 


"1A"7 i 
30/ 


Cm f 

91 


0*5 

yz 


1AO 

368 


92 


01 
y3 


365/ 


93 


Oil 

y** 


3/0 


94 


oc 
y3 


1*71 

371 


95 


96 


^ t Am 


OA 

yo 


97 


373 


97 1 


98 


374 


98 


99 


375 


99 


100 


376 


100 


101 


377 


101 


102 


378 


102 


103 


379 


103 


104 


380 


104 


105 


381 


105


106


382


106


107


383


107


108


384


108


109


385


109


110


386


110


111


387


111


112


388


112


113


389


113


114


390


114


115


391


115


116


392


116


117


393


117


118


394


118


119


395


119


120


396


120


121


397


121


122


398


122


123


399


123


124


400


124


125


401


125


126


402


126


127


403


127


128


404


128


129


405


129


130


406


130


131


407


131


132


408


132


133


409


133


134


410


134


135


411


135


136


412


136


137


413


137


138


414


138


139


415


139


140


416


140


141


417


141


142


418


142


143


419


143


144


420


144


145


421


145


146


422


146


147


423


147


148


424


148


149 


425 


149 


150 


426 


150 


151 


427 


151 


152 


428 


152 


153 


429 


153 


154 


430 


154 


155 


431 


155 


156 


432


156 


157 


433 


157 


158 


434 


158


SEQ ID NO of Full-length
Nucleotide Sequence


SEQ ID NO of Full-length
Peptide Sequence


SEQ ID NO in
Priority Application
USSN 60/323,739




159


435


159


160


436


160


161


437


161


162


438


162


163


439


163


164


440


164


165


441


165


166


442


166


167


443


167


168


444


168


169


445


169


170


446


170


171


447


171


172


448


172


173 


449 


173 


174 


450 


174 


175 


451 


175 


176 


452 


176 


177 


453 


177 


178 


454 


178 


179 


455 


179 


180 


456 


180 


181


457


181


182


458


182


183


459


183


184


460


184


185


461


185


186


462


186


187


463


187


188


464


188


189


465


189


190


466


190


191


467


191


192


468


192


193


469


193


194


470


194


195


471


195


196


472


196


197


473


197


198


474


198


199


475


199


200


476


200


201


477


201


202


478


202


203 


479 


203 


204 


480 


204


205 


481 


205


206 


482 


206


207 


483 


207 


208 


484 


208


209 


485 


209 


210 


486 


210 


211 


487 


211


SEQ ID NO of Full-length
Nucleotide Sequence


SEQ ID NO of Full-length
Peptide Sequence


SEQ ID NO in
Priority Application
USSN 60/323,739


212 


488 


212 


213 


489 


213 


214 


490 


214 


215 


491 


215 


216 


492 


216 


217 


493 


217 


218 


494 


218 


219 


495 


219 


220 


496 


220 


221 


497 


221 


222 


498 


222 


223 


499 


223 


224 


500 


224 


225 


501 


225 


226 


502 


226 


227 


503 


227 


228 


504 


228 


229 


505 


229 


230 


506 


230 


231 


507 


231 


232 


508 


232


233 


509 


233 


234 


510 


234 


235 


511 


235 


236 


512 


236 


237 


513 


237 


238 


514 


238 


239 


515 


239


240 


516 


240 


241 


517 


241 


242 


518 


242 


243 


519 


243 


244


520


244


245


521


245


246


522


246


247


523


247


248


524


248


249


525


249


250


526


250


251


527


251


252


528 


252 


253 


529 


253 


254 


530 


254


255


531


255


256 


532 


256 


257 


533 


257 


258 


534 


258 


259 


535 


259 


260 


536 


260 


261 


537 


261


262 


538 


262 


263 


539 


263 


264 


540 


264 



WO 03/025148

PCT/US02/29964

Table 10


SEQ ID NO of Full-length
Nucleotide Sequence


SEQ ID NO of Full-length
Peptide Sequence


SEQ ID NO in
Priority Application

265


541


265


266


542


266


267 


543 


267 


268 


544 


268 


269 


545 


269 


270 


546 


270 


271 


547 


271 


272 


548 


272


273 


549 


273 


274 


550 


274 


275 


551 


275 


276 


552 


276 



